Preface


This book has evolved from courses on optimization that we have been teaching at
I.I.T. Delhi for many years.

The main aim of this book is to provide a focussed and detailed study of various numerical
optimization methods and their applications in science, engineering, finance and management.
Apart from discussing standard optimization methods, the book includes some very recent 
topics like semi-definite programming, second order cone programming and evolutionary 
methods for global optimization. An attempt has also been made to present some modern and 
nonconventional applications of numerical optimization in the areas of machine learning, 
financial mathematics, and VLSI design. A distinctive feature of the book is also to provide 
basic MATLAB codes as building blocks for readers to develop their own codes for various 
algorithms discussed in the book. 

This text has been designed for two, one semester courses of basic study in numerical 
optimization with applications for B.Tech, M.Tech, M.Sc, and M.Phil (Mathematics/Statistics/ 
Operations Research/ Computer Sciences); and M.B.A students who opt for courses in 
optimization/operations research. If in certain universities/institutes, there is only one, one 
semester course on optimization/operations research, then appropriate topics can be chosen 
by the instructor as per his/her requirements. Some universities offer a course on linear 
programming and game theory at the undergraduate level as well. These topics are also 
covered in the book. Further, the book can also be used as a reference for researchers and 
practitioners. 

An interesting and useful feature of this book is its examples and end-of-chapter exercises.
Within each chapter, detailed numerical examples and graphical illustrations are included 
for better understanding of concepts and working of various algorithms. Certain objective 
type exercises have been specially constructed to check subtle aspects of theory and algorithms. 
Each chapter ends with a summary and additional notes section, which tries to provide some 
recent references for further study. 

Although every care has been taken to make the presentation error free, some errors may 
still remain and we hold ourselves responsible for that. The readers are requested to kindly 
communicate errors, if any, at chandras@maths.iitd.ac.in (e-mail address of S. Chandra). 

In the long process of writing this book we have been encouraged and helped by many
individuals. In particular we would like to thank Professors C. R. Bector, S. K. Gupta, S. C.
Duttaroy, N. S. Kambo, R. N. Kaul, T. R. Gulati, S. K. Neogy, Joydeep Dutta, B. Chandra, B.
S. Panda, S. Dharmaraja, D. Bhatia, S. R. Arora, S. K. Suneja, C. S. Lalitha and Pankaj
Gupta, for their suggestions and interest in this book. We gratefully acknowledge the book
grant provided by IIT Delhi and thank the Director, Prof. S. Prasad and the Deputy Director
(Faculty) Prof. B. N. Jain for their help in this regard.

Very special thanks and appreciation are due to our Ph.D students Ms. Reshma
Khemchandani, Me Neenalj Gupta, Ms. Anulekha Dhara and Mr. Dhirendra Singh Yadav
1
Introduction 


————————— 


1.1 An Overview of Optimization 


Optimization constitutes a very important branch of modern applied mathematics. A
variety of problems arising in the areas of engineering design, operations research,
management science, computer science, financial engineering and economics can be modeled
as optimization problems. There are three major aspects of optimization, which are
identified as theory, algorithms and applications. It has been possible to use optimization
in real life applications because of the availability of efficient algorithms, and these
algorithms have been developed because of certain interesting research in optimization
theory. Thus, there is a very strong linkage between theory and algorithms on one hand,
and algorithms and applications on the other hand. Another aspect of optimization
which is equally important is the development of good software for various algorithms.
Time has proven that the practical value of an algorithm is largely determined by its
numerical performance, robustness, and ease of computer implementation. Therefore
developing efficient and reliable software for optimization algorithms is certainly very
critical.

Though the existence of optimization techniques can possibly be traced to the era
of Isaac Newton, L. Euler, J. L. Lagrange and A. L. Cauchy, it was the development of
the simplex method for linear programming by G. B. Dantzig in the mid 40's which,
in a sense, started the subject of mathematical optimization the way we understand
it today. Another major development was due to H. W. Kuhn and A. W. Tucker in
1951, who gave necessary/sufficient optimality conditions for nonlinear optimization or
nonlinear programming problems. These conditions are now called the Karush-Kuhn-Tucker
(KKT) conditions because W. Karush in 1939 had already developed conditions
similar to those given by Kuhn and Tucker.

The algorithmic development in nonlinear optimization got a boost from the pioneering
work of W. C. Davidon, R. Fletcher and M. J. D. Powell in the late 50's in the area
of unconstrained optimization. The sequential unconstrained minimization technique
(SUMT) developed by A. V. Fiacco and G. P. McCormick in the mid 60's is still considered
very efficient for solving nonlinear optimization problems. In 1984, N. Karmarkar
developed a polynomial time algorithm for linear programming which not only renewed


2 Numerical Optimization with Applications 


the interest of researchers in solving large linear programming problems but also drew the
attention of many researchers to the powerful interior point methods for solving structured
problems. The area of applications has also gone through a major change.
In addition to the traditional applications in engineering and management, there are now
areas like machine learning, embedded systems, VLSI, and portfolio management which
routinely use the latest optimization algorithms.

1.2 Optimization Problems 

We start with a generic description of a finite dimensional deterministic optimization
problem. Let $f$, $g_i$ $(i = 1,2,\ldots,m)$ and $h_j$ $(j = 1,2,\ldots,p)$ be functions from $\mathbb{R}^n$ to $\mathbb{R}$
and $D$ be a set in $\mathbb{R}^n$. Then an optimization problem can be stated as

$$
\begin{aligned}
\text{Min} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0 \quad (i = 1,2,\ldots,m) \\
& h_j(x) = 0 \quad (j = 1,2,\ldots,p) \\
& x \in D,
\end{aligned} \tag{1.1}
$$

where $x \in \mathbb{R}^n$ is the vector of unknown decision variables.

A finite dimensional optimization problem as described above is also called a mathematical
programming problem. Here the function $f$ is called the objective function and
the functions $g_i$ $(i = 1,2,\ldots,m)$ and $h_j$ $(j = 1,2,\ldots,p)$ are called the constraint functions.
Without any loss of generality, problem (1.1) could also have been stated in the
maximization form because $\text{Max}(f(x)) = -\text{Min}(-f(x))$.

A point $x \in \mathbb{R}^n$ satisfying all the constraints of problem (1.1) is called a feasible
solution, and the set of all feasible solutions is called the feasible set or the feasible
region. We denote it by $S$; if $S$ is empty, then the problem is called infeasible. If it is
possible to find a sequence $x^{(k)} \in S$ such that $f(x^{(k)}) \to -\infty$ as $k \to \infty$ ($+\infty$ in the
maximization case), then the problem is called unbounded.

A point $\bar{x} \in S$ is called a global min point of problem (1.1) if $f(\bar{x}) \le f(x)$ for all $x \in S$. If
$f(\bar{x}) < f(x)$ for all $x \in S$, $x \ne \bar{x}$, then $\bar{x}$ is called a strict global min point. If there exists a
$\delta > 0$ such that $f(\bar{x}) \le f(x)$ for all $x \in S \cap N_\delta(\bar{x})$, $N_\delta(\bar{x}) = \{x \in \mathbb{R}^n : \|x - \bar{x}\| < \delta\}$ being
the $\delta$-neighbourhood of $\bar{x}$, then $\bar{x}$ is called a local min point of the problem.

The definitions of a global max point, a
... problem and also to ... given above. Here the feasible region consists of all points
$(x_1, x_2) \in \mathbb{R}^2$ which satisfy the given constraints. Noting that $x_1^2 - x_2 - 3 = 0$
represents a parabola with vertex $(0, -3)$, it is simple to visualize the feasible region
as shown in Fig 1.1.

Now in the feasible region $S$, we wish to find $(x_1, x_2)$ for which the value of the
objective function, namely $(x_1 - 3)^2 + (x_2 - 2)^2$, is least. For this, we observe that the
contours of the objective function are circles centred at $(3, 2)$. Therefore to find
the optimal solution we look for the circle with the smallest radius that
intersects the feasible region. This circle meets the feasible region at $(2, 1)$,
and the optimal value is 2.


1.3 Some Simple Illustrative Examples 


In this section we present some simple illustrative examples of linear and nonlinear
optimization problems. Though these examples could not be labeled as 'realistic', they
are good enough to convince us of the applicability of numerical optimization. A more
detailed discussion on the applications of mathematical programming in areas like
machine learning, portfolio optimization and engineering design is presented in Chapters
16, 17, and 18, respectively.





Cargo-Loading Problem 


Let there be a vessel which is to be loaded with stocks of $N$ items. Also let each unit
of item $i$ have a weight $w_i$, a volume $q_i$, and a value $v_i$ for $i = 1,2,\ldots,N$. The cargo has
weight and volume capacity of $W$ and $Q$ respectively. We wish to determine the number
of units $x_i$ of each item $i$ $(i = 1,2,\ldots,N)$ to be loaded on the vessel so that the total
value of the cargo is maximum.

The mathematical formulation of this problem is

$$
\begin{aligned}
\text{Max} \quad & \sum_{i=1}^{N} v_i x_i \\
\text{subject to} \quad & \sum_{i=1}^{N} w_i x_i \le W \\
& \sum_{i=1}^{N} q_i x_i \le Q \\
& x_i \ge 0 \text{ and integer} \quad (i = 1,2,\ldots,N).
\end{aligned} \tag{1.3}
$$


Problem (1.3) is representative of the general class of optimization problems known
as knapsack problems. If we do not have integer constraints on the decision variables $x_i$,
then it becomes a linear programming problem. However if we restrict the $x_i$'s to be integers
then problem (1.3) becomes an integer linear programming problem.
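
For very small instances, problem (1.3) can be solved by simply trying every integer loading plan. The sketch below is only an illustration of the formulation, not an algorithm from this book: the function name `cargo_loading` and the three-item data are hypothetical, and all weights, volumes and capacities are assumed to be positive integers.

```python
from itertools import product

def cargo_loading(w, q, v, W, Q):
    """Return (best_value, best_x) for problem (1.3) by exhaustive search.

    Assumes positive integer weights w, volumes q and capacities W, Q.
    """
    n = len(v)
    # x_i can never exceed what either capacity alone would allow.
    upper = [min(W // w[i], Q // q[i]) for i in range(n)]
    best_value, best_x = 0, (0,) * n
    for x in product(*(range(u + 1) for u in upper)):
        weight = sum(wi * xi for wi, xi in zip(w, x))
        volume = sum(qi * xi for qi, xi in zip(q, x))
        if weight <= W and volume <= Q:
            value = sum(vi * xi for vi, xi in zip(v, x))
            if value > best_value:
                best_value, best_x = value, x
    return best_value, best_x

# Hypothetical data: three items with integer weights, volumes and values.
value, load = cargo_loading(w=[2, 3, 4], q=[3, 2, 5], v=[4, 5, 9], W=10, Q=10)
print(value, load)
```

Exhaustive search grows exponentially with the number of items, so it is useful only for checking a formulation on toy data.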


Multi-stage Compressor Problem 


In an $N$-stage unit, an ideal gas has to be compressed from an initial pressure $p_0$ to a
final pressure $p_N$. Let it be known that for an $N$-stage unit, the energy $E_N$ be given by

$$
E_N = K\left[\left(\frac{p_1}{p_0}\right)^{\alpha} + \left(\frac{p_2}{p_1}\right)^{\alpha} + \cdots + \left(\frac{p_N}{p_{N-1}}\right)^{\alpha} - N\right].
$$

The problem is to determine the intermediate discharge pressures $p_1, p_2, \ldots, p_{N-1}$
$(p_0 \le p_1 \le \cdots \le p_{N-1} \le p_N)$ so that the total energy $E_N$ in the process is minimum. If
we define $x_i = p_i / p_{i-1}$ $(i = 1,2,\ldots,N)$ then we get the following formulation
to minimize $E_N$,

$$
\begin{aligned}
\text{Min} \quad & K\left(\sum_{i=1}^{N} x_i^{\alpha} - N\right) \\
\text{subject to} \quad & \prod_{i=1}^{N} x_i = \frac{p_N}{p_0} \\
& x_i \ge 1 \quad (i = 1,2,\ldots,N),
\end{aligned}
$$

which is a nonlinear optimization problem.


Cutting Stock Problem 


Suppose that metal sheets are produced in rolls of standard fixed length $l$ and standard
width $w$. Further suppose that a customer places a large order for sheets of width $w$ but
of varying lengths. Specifically the order requires $b_i$ sheets with length $l_i$ $(i = 1,2,\ldots,m)$
and width $w$. To meet this order, standard sheets are to be cut in a manner so that the
demand is satisfied and the wastage is also minimized. If we assume that the scrap pieces
are of no use, then the problem boils down to minimizing the number of rolls needed to
satisfy the order.

Now given a standard sheet of length $l$, there are many ways of cutting it. We call
each of these ways a cutting pattern, and characterize the $j$th cutting pattern by the
column vector $a^{(j)}$ whose $i$th component $a_{ij}$ is a non-negative integer denoting the number
of sheets of length $l_i$ in the $j$th pattern. Obviously the vector $a^{(j)}$ represents a cutting
pattern if $\sum_{i=1}^{m} a_{ij} l_i \le l$ and each $a_{ij}$ is a non-negative integer. We can take $j = 1,2,\ldots,n$
because the number of cutting patterns is finite, which we denote by $n$. If we now let $x_j$
denote the number of standard rolls cut according to the $j$th pattern, then we get the
following formulation to minimize the waste and satisfy the order

$$
\begin{aligned}
\text{Min} \quad & \sum_{j=1}^{n} x_j \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} x_j \ge b_i \quad (i = 1,2,\ldots,m) \\
& x_j \ge 0 \text{ and integer} \quad (j = 1,2,\ldots,n).
\end{aligned} \tag{1.4}
$$

Problem (1.4) is an integer linear programming problem.
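
The columns $a^{(j)}$ of problem (1.4) can be generated mechanically for a small instance. The following sketch enumerates every non-negative integer vector whose total cut length fits in one standard roll. The function name `cutting_patterns` and the data (roll length 10, ordered lengths 3 and 4) are hypothetical, and the all-zero pattern is skipped as an assumption because it cuts nothing.

```python
from itertools import product

def cutting_patterns(roll_length, lengths):
    """List all patterns a with sum(a_i * l_i) <= roll_length, a_i >= 0 integer."""
    # a_i can be at most floor(l / l_i) for each ordered length l_i.
    ranges = [range(roll_length // li + 1) for li in lengths]
    patterns = []
    for a in product(*ranges):
        used = sum(ai * li for ai, li in zip(a, lengths))
        if any(a) and used <= roll_length:  # skip the useless all-zero pattern
            patterns.append(a)
    return patterns

# Hypothetical data: standard roll of length 10, ordered lengths 3 and 4.
patterns = cutting_patterns(10, [3, 4])
print(len(patterns), patterns)
```

The number of patterns $n$ grows quickly with $m$, which is why practical methods avoid listing all columns explicitly.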
which is an NLP but can be transformed to a linear programming problem.


Production Planning Problem 


Consider a firm which produces a certain product and wishes to plan its production
schedule over a period of time in an optimal manner. Here it is assumed that the
demand function is known over the time horizon and this demand must be met. Also
it is assumed that the storage cost is proportional to the amount stored, and there is
a production cost associated with a given rate of production. To have a mathematical
model of this problem we introduce the following notation

$x(t)$ = stock held at time $t$
$r(t)$ = rate of production at time $t$
$d(t)$ = demand at time $t$
$\dot{x}(t) = \dfrac{dx(t)}{dt}$

Then the production system can be described by the following differential equation

$$
\dot{x}(t) = r(t) - d(t), \quad t \in [0, T], \quad x(0) = x_0 \text{ (given)}. \tag{1.7}
$$

But (1.7) can equivalently be described by the integral equation

$$
x(t) = x(0) + \int_0^t [r(s) - d(s)]\,ds,
$$

with $r(t) \ge 0$, $x(t) \ge 0$ for $t \in [0, T]$.

If for the production level $r$, $c(r)$ is the production cost rate and for the
inventory level $x$, $h(x)$ is the inventory cost rate, then the cost function to be
minimized is

$$
\int_0^T \big(c(r(t)) + h(x(t))\big)\,dt.
$$

Therefore we get the following optimization problem

$$
\begin{aligned}
\text{Min} \quad & \int_0^T \big(c(r(t)) + h(x(t))\big)\,dt \\
\text{subject to} \quad & x_0 + \int_0^t \big(r(s) - d(s)\big)\,ds \ge 0 \quad (0 \le t \le T), \\
& r(t) \ge 0 \quad (0 \le t \le T),
\end{aligned} \tag{1.8}
$$
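
To see how the stock equation (1.7) and the cost integral behave for a given production plan, one can discretize time into steps of length `dt`. The sketch below is a minimal illustration under stated assumptions: a forward (Euler) update for the stock, a left-endpoint sum for the cost, and hypothetical cost rates $c(r) = r^2$ and $h(x) = 0.5x$. None of these choices come from the text.

```python
def simulate_stock(x0, r, d, dt):
    """Stock levels x_0, x_1, ... from x_{k+1} = x_k + (r_k - d_k) * dt."""
    x = [x0]
    for rk, dk in zip(r, d):
        x.append(x[-1] + (rk - dk) * dt)
    return x

def total_cost(x, r, c, h, dt):
    """Left-endpoint approximation of the integral of c(r(t)) + h(x(t))."""
    return sum((c(rk) + h(xk)) * dt for rk, xk in zip(r, x))

# Hypothetical plan: constant production 3 against a varying demand.
r = [3, 3, 3, 3]
d = [2, 4, 3, 2]
x = simulate_stock(x0=2, r=r, d=d, dt=1)
feasible = all(xk >= 0 for xk in x) and all(rk >= 0 for rk in r)
cost = total_cost(x, r, c=lambda rate: rate ** 2, h=lambda stock: 0.5 * stock, dt=1)
print(x, feasible, cost)
```

A plan is feasible for (1.8) in this discretized sense when every stock level and every production rate is non-negative.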


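$$
\begin{aligned}
\text{Max (or Min)} \quad & z = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to} \quad & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \; (\le, =, \ge) \; b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \; (\le, =, \ge) \; b_2 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \; (\le, =, \ge) \; b_m \\
& x_1, x_2, \ldots, x_n \ge 0.
\end{aligned} \tag{2.1}
$$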


Here it is understood that only one of the three symbols '$\le$', '$=$', '$\ge$' holds in each of the first $m$
constraints and these might be different for different constraints. Also we can consider
the problem either in the maximization form or in the minimization form, because either
form can be converted into the other form by noting that for an arbitrary real valued
function $f$ defined over a domain $D \subseteq \mathbb{R}^n$, $\max f(x) = -\min(-f(x))$. In our presentation
we shall mostly take the LPP in the maximization form.
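
The identity $\max f(x) = -\min(-f(x))$ can be checked numerically on a finite sample of points. The sketch below is only an illustration; the function `f` and the sample grid are hypothetical choices, not taken from the text.

```python
# Hypothetical objective and sample points, chosen only for illustration.
def f(x):
    return 4 * x - x * x

points = [k / 10 for k in range(0, 51)]  # grid on [0, 5]

max_direct = max(f(x) for x in points)       # maximize f directly
max_via_min = -min(-f(x) for x in points)    # minimize -f, then flip the sign

print(max_direct, max_via_min)
```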

To have some idea about a possible procedure for solving LPP's, we consider the
following problem in two variables

$$
\begin{aligned}
\text{Max} \quad & z = 4x_1 + 3x_2 \\
\text{subject to} \quad & x_1 + x_2 \le 8 \\
& 2x_1 + x_2 \le 10 \\
& x_1, x_2 \ge 0.
\end{aligned} \tag{2.2}
$$

In this linear programming problem, there are only two decision variables so a number
of basic facts about a general LPP can possibly be understood geometrically and proved
later analytically. For this let us try to identify the set of all points $(x_1, x_2) \in \mathbb{R}^2$ satisfying
the constraints of the given LPP. To start with let us take the first constraint $x_1 + x_2 \le 8$
and note that the line $x_1 + x_2 = 8$ divides $\mathbb{R}^2$ into two half spaces, one for which $x_1 + x_2 \le 8$
and the other for which $x_1 + x_2 \ge 8$. To identify which inequality sign corresponds to which
half space, we look at the position of a specific point, say $(0,0)$. As $x_1 + x_2 \le 8$ holds for
$(0,0)$, all points on or below the line $x_1 + x_2 = 8$ will meet the constraint $x_1 + x_2 \le 8$, as
shown in Fig 2.1. Similarly we show the half spaces as given by the other constraints.
The set of points satisfying all the constraints is the shaded region
in Fig 2.1. This set $S$ is called the feasible region or feasible set.
If we now draw the lines $4x_1 + 3x_2 = k$ for various values of $k$, we shall get a family
of parallel lines. The direction in which the objective function increases is
depicted by an arrow in Fig 2.1.

Fig. 2.1. Feasible region of problem (2.2), bounded by the lines $x_1 + x_2 = 8$ and $2x_1 + x_2 = 10$, with the objective line $4x_1 + 3x_2 = k$.

Here it should


be noted that because of linearity of the objective function, the gradient is constant
and therefore once we know the direction of increase (decrease), in that direction the
objective function is going to increase (decrease) for all time to come. This, in general,
is not going to happen for a nonlinear function as there the gradient depends on the
point $x$, and therefore the function may increase (decrease) for some time and then start
decreasing (increasing). This suggests that for locating the optimal solution of the given
LPP, one has simply to slide the line of the objective function over the feasible region and
continue sliding as long as possible. In doing so for our example we observe the following
simple but important facts.

The feasible region is a closed set, which is bounded by straight lines, and has finitely
many corner points. Also if we take two points in the feasible region then the whole
line segment joining them remains in the feasible region; i.e. it is a convex set. Thus
the feasible region is a closed, bounded convex set having finitely many corner points.
Such a set is called a polytope.

The optimal point of the above problem cannot be an interior point. In fact it is a
corner point as shown in Fig 2.1.

In view of the above, for locating the maximizing (or minimizing) point of an LPP it
is enough to find all corner points of the feasible region and choose the one at
which the objective function value is maximum (or minimum). This method of solving
an LPP is called the graphical method.

In the context of the example taken here, the four corner points of the feasible region
are $O : (0,0)$, $P : (5,0)$, $Q : (2,6)$ and $R : (0,8)$, and the objective function value is
$z(0,0) = 0$, $z(5,0) = 20$, $z(2,6) = 26$, and $z(0,8) = 24$. Thus $x^* = (2,6)$
is the optimal solution and $z^* = 26$ is the optimal value.
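
The corner-point search described above can be mimicked in a few lines for problem (2.2). The sketch below intersects every pair of boundary lines, keeps the intersections that satisfy all constraints, and evaluates $z = 4x_1 + 3x_2$ at each. The helper name `corner_points` and the encoding of each constraint as a triple `(a1, a2, b)` meaning $a_1 x_1 + a_2 x_2 \le b$ are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Constraints of problem (2.2) written as a1*x1 + a2*x2 <= b,
# including the sign restrictions -x1 <= 0 and -x2 <= 0.
constraints = [(1, 1, 8), (2, 1, 10), (-1, 0, 0), (0, -1, 0)]

def corner_points(cons):
    """Feasible intersections of pairs of boundary lines (Cramer's rule)."""
    points = set()
    for (a1, a2, b), (c1, c2, d) in combinations(cons, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:          # parallel boundary lines never meet
            continue
        x1 = Fraction(b * c2 - a2 * d, det)
        x2 = Fraction(a1 * d - b * c1, det)
        if all(p * x1 + q * x2 <= r for p, q, r in cons):
            points.add((x1, x2))
    return points

corners = corner_points(constraints)
best = max(corners, key=lambda pt: 4 * pt[0] + 3 * pt[1])
print(sorted(corners), best, 4 * best[0] + 3 * best[1])
```

Exact rational arithmetic (`Fraction`) is used so that the feasibility test is not affected by rounding.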
... line of objective function over the feasible region for a general LPP,
... happen that we hit only one corner point or sliding will always come
... if the feasible region of the LPP is unbounded, or it is empty or
2.3 The Simplex Method as an Algebraic Version of The Graphical
Method


In the last section we have observed that the graphical method cannot be applied for
LPP's that have more than two (or at best three) decision variables because one cannot
visualize the geometrical object 'corner point' for such problems.

Here it is important to note that this is not a limitation of the method as such, but
rather our inability to visualize geometrical objects beyond three dimensions. Therefore
it seems natural to look into the possibility of understanding the graphical method
algebraically. In fact this 'algebraic translation' of the graphical method is possible
and that is essentially the well known simplex method for linear programming problems
developed by G. B. Dantzig in 1947.

To understand the simplex method, we need to understand the concept of a basic
feasible solution because as we shall see later, this is precisely the algebraic analogue
of the geometrical concept 'corner point'. For this we need to express the given LPP in
the standard form by introducing an appropriate number of slack and surplus variables at
appropriate places. This is done to convert linear inequalities into linear equations so
that traditional linear algebra becomes applicable.


In the following we now take an example and describe how the given LPP can be
expressed in the standard form. Let the problem be

$$
\begin{aligned}
\text{Max} \quad & z = 2x_1 + 3x_2 \\
\text{subject to} \quad & 3x_1 + x_2 \le 3 \\
& 4x_1 + 3x_2 \ge 6 \\
& x_1 + 2x_2 = 3 \\
& x_1, x_2 \ge 0.
\end{aligned} \tag{2.3}
$$

Here the first inequality is with '$\le$' sign so we have to add a non-negative variable,
say $x_3$, so that this can be written as $3x_1 + x_2 + x_3 = 3$.

Similarly the second constraint is with '$\ge$' sign and so we have to subtract a non-negative
variable, say $x_4$, so that this can be written as $4x_1 + 3x_2 - x_4 = 6$.

Variables like $x_3$ which are nonnegative and are added to '$\le$' type constraints are
called the slack variables.

Similarly, variables like $x_4$ which are nonnegative and are subtracted from '$\ge$' type
constraints are called the surplus variables.

Now taking the coefficients of the slack and surplus variables as zero in the objective
function, we get another linear programming problem as follows

$$
\begin{aligned}
\text{Max} \quad & z = 2x_1 + 3x_2 + 0x_3 + 0x_4 \\
\text{subject to} \quad & 3x_1 + x_2 + x_3 = 3 \\
& 4x_1 + 3x_2 - x_4 = 6 \\
& x_1 + 2x_2 = 3 \\
& x_1, x_2, x_3, x_4 \ge 0.
\end{aligned} \tag{2.4}
$$
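
The passage from (2.3) to (2.4) is mechanical and can be sketched in code. The helper below appends one column with coefficient $+1$ for each '$\le$' row (slack) and one column with coefficient $-1$ for each '$\ge$' row (surplus), and gives these new variables zero cost. The function name `to_standard_form` and the string encoding of the constraint senses are assumptions made for this illustration.

```python
def to_standard_form(A, senses, b, c):
    """Convert rows with senses '<=', '>=' or '=' into equalities.

    Returns (A_std, b, c_std) with one extra column per inequality row.
    """
    # Rows that need an extra variable, in order of appearance.
    extra = [i for i, s in enumerate(senses) if s in ("<=", ">=")]
    A_std = []
    for i, row in enumerate(A):
        new_cols = [
            (1 if senses[i] == "<=" else -1) if i == k else 0
            for k in extra
        ]
        A_std.append(list(row) + new_cols)
    c_std = list(c) + [0] * len(extra)  # slack and surplus variables cost nothing
    return A_std, list(b), c_std

# Data of problem (2.3).
A_std, b_std, c_std = to_standard_form(
    A=[[3, 1], [4, 3], [1, 2]],
    senses=["<=", ">=", "="],
    b=[3, 6, 3],
    c=[2, 3],
)
print(A_std, b_std, c_std)
```

This sketch does not handle negative right-hand sides or free variables; those need the additional steps discussed with the standard form.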





There is a very close connection between the linear programming problems (2.3) and
(2.4), which we state in the form of the results given below.


Result 2.3.1 There is a one to one correspondence between the feasible solutions of
problems (2.3) and (2.4), in the sense that if $(x_1, x_2)$ is feasible to (2.3) then there exist
unique $x_3$ and $x_4$, namely $x_3 = 3 - 3x_1 - x_2$ and $x_4 = 4x_1 + 3x_2 - 6$, so that $(x_1, x_2, x_3, x_4)$
is feasible to (2.4); and conversely if $(x_1, x_2, x_3, x_4)$ is feasible to (2.4) then $(x_1, x_2)$ is
feasible to (2.3).


Result 2.3.2 Let $(x_1^*, x_2^*)$ be optimal to (2.3). Then there exist $x_3^* = 3 - 3x_1^* - x_2^*$
and $x_4^* = 4x_1^* + 3x_2^* - 6$ such that $(x_1^*, x_2^*, x_3^*, x_4^*)$ is optimal to (2.4). Conversely, if
$(x_1^*, x_2^*, x_3^*, x_4^*)$ is optimal to (2.4) then $(x_1^*, x_2^*)$ is optimal to (2.3).


The proof of Result 2.3.2 follows from Result 2.3.1 and the fact that in problem
(2.4), the coefficients of the slack and surplus variables in the objective function are taken
as zero. Here it may be remarked that though the above results are stated in the context of
an example only, it is clear that they hold for a general linear programming problem
as well. Some readers may like to prove this last statement themselves.


LPP in Standard Form 


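In view of Results 2.3.1 and 2.3.2, we can assume without any loss of generality that all
constraints of the given LPP are with '$=$' sign except $(x_j \ge 0, j = 1,2,\ldots,n)$. Thus any
LPP can be taken in the form

$$
\begin{aligned}
\text{Max} \quad & z = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to} \quad & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \\
& x_1, x_2, \ldots, x_n \ge 0.
\end{aligned}
$$

If we write

$$
x = \mathrm{col}(x_1, x_2, \ldots, x_n), \quad
b = \mathrm{col}(b_1, b_2, \ldots, b_m), \quad
c = \mathrm{col}(c_1, c_2, \ldots, c_n),
$$

$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}_{m \times n}
$$

then the above LPP can be expressed as

$$
\begin{aligned}
\text{Max} \quad & z = c^T x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0.
\end{aligned}
$$

Here $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, $A = (a_{ij}) \in \mathbb{R}^{m \times n}$, and the vector inequality $x \ge 0$ is
understood as $x_j \ge 0$, $j = 1,2,\ldots,n$.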

In the above, we can further assume that $b \ge 0$, because if some component, say
$b_i$, is $< 0$, then the $i$th equality can be multiplied by $-1$ to get $b_i > 0$. Also it can be
assumed that Rank $A = m (< n)$, because if this is not true then the system $Ax = b$ is
either redundant or has a unique solution. Therefore, without any loss of generality we
can assume that the given LPP has the following form

$$
\begin{aligned}
\text{Max} \quad & z = c^T x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0,
\end{aligned} \tag{2.5}
$$

with (i) $b \ge 0$ and (ii) Rank $A = m (< n)$. Any LPP having this form with these two
conditions being satisfied is called an LPP in the standard form. It is simple to argue
that any given LPP can be taken in the standard form without any loss of generality.

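If we now refer to problem (2.3) then we note that problem (2.4) is in the standard
form with

$$
x = \mathrm{col}(x_1, x_2, x_3, x_4), \quad c = \mathrm{col}(2, 3, 0, 0), \quad b = \mathrm{col}(3, 6, 3)
$$

and

$$
A = \begin{bmatrix}
3 & 1 & 1 & 0 \\
4 & 3 & 0 & -1 \\
1 & 2 & 0 & 0
\end{bmatrix}.
$$

Here $b \ge 0$ and Rank $(A) = 3 (< 4)$.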
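
The rank condition for this matrix can be confirmed by row reduction. The sketch below computes the rank by forward elimination in exact rational arithmetic; the function name `rank` is a hypothetical helper written for this illustration, not a routine from this book.

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix by forward elimination with exact fractions."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * p for a, p in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

# Coefficient matrix of problem (2.4).
A = [[3, 1, 1, 0],
     [4, 3, 0, -1],
     [1, 2, 0, 0]]
print(rank(A))
```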


Basic Feasible Solution


Now we proceed to introduce the main concept of this section, namely the basic feasible
solution, written in short as b.f.s, for the system $Ax = b$, $x \ge 0$ with Rank $A = m (< n)$.
Let $a^{(j)}$ $(j = 1,2,\ldots,n)$ be the $j$th column of $A$, i.e. $a^{(j)} = \mathrm{col}(a_{1j}, a_{2j}, \ldots, a_{mj})$ and
$A = [a^{(1)}, a^{(2)}, \ldots, a^{(n)}]$. As Rank $A = m (< n)$, there certainly exists a set of $m$ linearly
independent columns of $A$. We get hold of one such set of $m$ linearly independent
columns of $A$ and call these columns basic columns. The remaining $(n - m)$ columns
of $A$ are called non-basic columns. We next form an $(m \times m)$ matrix $B$ consisting of these
$m$ basic columns, and call it a basis matrix. Here we may note that there may be more
than one basis matrix for the system $Ax = b$ as the choice of $m$ linearly independent
columns of $A$ is not unique in general. It is simple to argue that the number of basis
matrices is always less than or equal to ${}^{n}C_{m}$.

We shall now define the basic solution of the system $Ax = b$ for a given basis matrix
$B$. If the basis matrix $B$ is changed to get a new basis matrix $\hat{B}$, then the basic solution
will also change as the basic solution is defined for a given basis matrix only. Let us now
assume that a basis matrix $B$ is given. We note that there is a one to one correspondence
between the columns of $A$ and the components of the vector $x$. For example $x_1$
corresponds to $a^{(1)}$, $x_2$ corresponds to $a^{(2)}$ and in general $x_j$ corresponds to $a^{(j)}$. Therefore
we can identify those components of the vector $x$ which correspond to basic columns, i.e.
columns of $B$. These components of $x$, which are $m$ in number, are called basic variables
and the remaining $(n - m)$ components of $x$ are called non-basic variables. If we now set all
$(n - m)$ nonbasic variables to zero and solve the system $Ax = b$ for the $m$ basic variables
then the solution so obtained is called the basic solution for the given basis matrix $B$.
Some readers may note here that this is a well known result in matrix theory which
states that a system of $m$ linear equations in $n$ unknowns, with Rank $A = m (< n)$, has
infinitely many solutions depending on $(n - m)$ parameters. For the definition of the
basic solution, these $(n - m)$ parameters, namely the $(n - m)$ nonbasic variables, are given
the zero value.

Next we define a basic feasible solution (b.f.s). For this, we consider the system
$Ax = b$, $x \ge 0$. Let $B$ be the given basis matrix. Then we can obtain the basic solution
of $Ax = b$ for the given basis matrix $B$ using the procedure described above. If in this
basic solution all basic variables are non-negative then this is called the basic feasible
solution for the basis matrix $B$. As all nonbasic variables are zero by definition, a solution
is a b.f.s if it is both basic and feasible. For a given basic feasible solution if all basic
variables are strictly positive, then it is called a nondegenerate b.f.s, otherwise (i.e. when
some basic variable takes the value zero) it is called a degenerate b.f.s.

Since for the system $Ax = b$ with Rank$(A) = m (< n)$, there will be at most ${}^{n}C_{m}$ basis
matrices, the number of basic solutions for the given system is always less than or equal
to ${}^{n}C_{m}$. As every basic feasible solution has to be a basic solution, the number of basic
feasible solutions for the system $Ax = b$, $x \ge 0$ with Rank$(A) = m (< n)$ is also less than
or equal to ${}^{n}C_{m}$.


Example 2.3.1 Consider the system

$$
\begin{aligned}
x_1 + x_2 + x_3 &= 8 \\
2x_1 + x_2 + x_4 &= 10 \\
x_1, x_2, x_3, x_4 &\ge 0
\end{aligned}
$$

and find all basic solutions. Also identify those basic solutions which are basic feasible.
Are there any degenerate b.f.s?


Solution Here we identify

$$
x = \mathrm{col}(x_1, x_2, x_3, x_4), \quad b = \mathrm{col}(8, 10) \quad \text{and} \quad
A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{bmatrix}.
$$

We shall first find all possible basis matrices. As Rank $A = 2$, any $(2 \times 2)$ submatrix of
$A$ whose determinant is not zero will be a basis matrix. In the matrix $A$ there are six
basis matrices and so we shall obtain six basic solutions (one for each basis matrix) for
the given system of linear equations. We consider these basis matrices one by one now.

(i) $B_1 = \begin{bmatrix} 1 & 1 \\ 2 & 1 \end{bmatrix}$. As $|B_1| \ne 0$, $B_1$ is a basis matrix. Also $x_1$ and $x_2$ are basic variables
and $x_3$ and $x_4$ are nonbasic variables. Therefore setting $x_3 = 0$, $x_4 = 0$ and solving
the resulting system for the basic variables $x_1$ and $x_2$ we get $x_1 = 2$, $x_2 = 6$. This
gives the basic solution corresponding to the basis matrix $B_1$ as $(x_1 = 2, x_2 = 6, x_3 = 0, x_4 = 0)$.
Further, as both basic variables are strictly positive this is a
non-degenerate basic feasible solution.

(ii) $B_2 = \begin{bmatrix} 1 & 1 \\ 2 & 0 \end{bmatrix}$. As $|B_2| \ne 0$, $B_2$ is a basis matrix. Here $x_1$ and $x_3$ are basic variables
and $x_2$ and $x_4$ are nonbasic variables. Therefore setting $x_2 = 0$, $x_4 = 0$ we get $x_1 = 5$, $x_3 = 3$.
This gives the basic solution corresponding to the basis matrix $B_2$ as
$(x_1 = 5, x_2 = 0, x_3 = 3, x_4 = 0)$ which is again a non-degenerate basic feasible solution.

(iii) $B_3 = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}$. As $|B_3| \ne 0$, $B_3$ is a basis matrix. Here $x_1$ and $x_4$ are basic variables
and $x_2$ and $x_3$ are nonbasic variables. Therefore setting $x_2 = 0$, $x_3 = 0$ we get $x_1 = 8$, $x_4 = -6$.
This gives the basic solution corresponding to the basis matrix $B_3$ as
$(x_1 = 8, x_2 = 0, x_3 = 0, x_4 = -6)$ which is not a basic feasible solution as $x_4$ is negative.

(iv) $B_4 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. As $|B_4| \ne 0$, $B_4$ is a basis matrix. Here $x_2$ and $x_3$ are basic variables
and $x_1$ and $x_4$ are nonbasic variables. Therefore setting $x_1 = 0$, $x_4 = 0$ we get
$x_2 = 10$, $x_3 = -2$. This gives the basic solution corresponding to the basis matrix
$B_4$ as $(x_1 = 0, x_2 = 10, x_3 = -2, x_4 = 0)$ which is not a basic feasible solution as $x_3$ is
negative.

(v) $B_5 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$. As $|B_5| \ne 0$, $B_5$ is a basis matrix. Here $x_2$ and $x_4$ are basic variables
and $x_1$ and $x_3$ are nonbasic variables. Therefore setting $x_1 = 0$, $x_3 = 0$ we get $x_2 = 8$, $x_4 = 2$.
This gives the basic solution corresponding to the basis matrix $B_5$ as
$(x_1 = 0, x_2 = 8, x_3 = 0, x_4 = 2)$ which is a non-degenerate basic feasible solution.

(vi) $B_6 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. As $|B_6| \ne 0$, $B_6$ is a basis matrix. Here $x_3$ and $x_4$ are basic variables
and $x_1$ and $x_2$ are nonbasic variables. Therefore setting $x_1 = 0$, $x_2 = 0$ we get $x_3 = 8$, $x_4 = 10$.
This gives the basic solution corresponding to the basis matrix $B_6$ as
$(x_1 = 0, x_2 = 0, x_3 = 8, x_4 = 10)$ which is a non-degenerate basic feasible solution.


Thus in the above example we have six basic solutions and out of these only four are 
basic feasible solutions. Further all b.f.s are non-degenerate. 
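
The enumeration carried out in Example 2.3.1 can be reproduced mechanically: choose every set of $m$ columns, solve for the basic variables, and classify the result. The sketch below does this with exact fractions. The helper names `solve` and `basic_solutions` are hypothetical, and the Gauss-Jordan routine is a plain textbook version written for this illustration rather than code from this book.

```python
from fractions import Fraction
from itertools import combinations

def solve(B, b):
    """Solve B y = b by Gauss-Jordan elimination; return None if B is singular."""
    m = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(B, b)]
    for col in range(m):
        pivot = next((r for r in range(col, m) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

def basic_solutions(A, b):
    """Return a list of (basis column indices, basic solution x)."""
    m, n = len(A), len(A[0])
    found = []
    for basis in combinations(range(n), m):
        B = [[A[i][j] for j in basis] for i in range(m)]
        y = solve(B, b)
        if y is None:
            continue  # chosen columns are linearly dependent
        x = [Fraction(0)] * n  # nonbasic variables stay at zero
        for j, value in zip(basis, y):
            x[j] = value
        found.append((basis, x))
    return found

# System of Example 2.3.1.
A = [[1, 1, 1, 0], [2, 1, 0, 1]]
b = [8, 10]
all_basic = basic_solutions(A, b)
feasible = [(B, x) for B, x in all_basic if min(x) >= 0]
degenerate = [(B, x) for B, x in feasible if any(x[j] == 0 for j in B)]
print(len(all_basic), len(feasible), len(degenerate))
```

Column indices in `basis` are zero-based, so the basis `(0, 1)` corresponds to the basic variables $x_1$ and $x_2$.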
At this stage let us stop for a moment and go back to problem (2.2). In Fig 2.1
we observe that the feasible region has exactly four corner points while our discussion
here for the corresponding constraints in the standard form gives us exactly four basic
feasible solutions. Is it just a coincidence or is there something more in this observation?
Well, it is not a coincidence. In fact we have obtained the algebraic analogue of the
geometrical object corner point in the form of a basic feasible solution, i.e. 'corner point'
and b.f.s are one and the same. It is just the way of looking at them that is different, as one
is geometric in nature whereas the other is algebraic in nature. Also using Result
2.3.1 and Result 2.3.2, we can identify the specific corner point for the given b.f.s. Let
us take the b.f.s $(x_1 = 2, x_2 = 6, x_3 = 0, x_4 = 0)$ corresponding to the basis matrix
$B_1$. It corresponds to the corner point $(2,6)$ because to get the feasible point for the
given LPP from a feasible point of the corresponding LPP in the standard form, we
have to simply ignore the slack and surplus variables. In a similar manner, the b.f.s
$(x_1 = 5, x_2 = 0, x_3 = 3, x_4 = 0)$ corresponds to the corner point $(5,0)$. Here it should be
noted that this correspondence does not hold for all basic solutions; it holds for b.f.s only.
Thus for the basic solution $(x_1 = 8, x_2 = 0, x_3 = 0, x_4 = -6)$ there is no corresponding
point in the feasible region of Fig 2.1.
The above discussion can be summarized in the form of the following result.









Result 2.3.3 Every corner point of the feasible set $S$ is a basic feasible solution of the
system $Ax = b$, $x \ge 0$, and conversely every basic feasible solution of the above system is
a corner point of the set $S$.


Correspondence between basic ...

... point. This we illustrate with the help of the following system

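$x_1 + x_2 + x_3 = 3$
$x_1 - x_2 + x_4 = 0$
$x_1, x_2, x_3, x_4 \ge 0.$

Here
$x = \mathrm{col}(x_1, x_2, x_3, x_4)$
$b = \mathrm{col}(3, 0)$
$A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & -1 & 0 & 1 \end{bmatrix}$

Also, Rank(A) = 2 (< 4). Let us now take two basis matrices, namely $B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $B_2 = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$, and find the corresponding basic solutions.
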
(i) $B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. For this basis matrix $x_3$ and $x_4$ are basic variables, and $x_1$ and $x_2$ are nonbasic variables. Therefore taking $x_1 = x_2 = 0$ and solving the resulting system for $x_3$ and $x_4$, we get $x_3 = 3$, $x_4 = 0$, thereby getting the basic solution ($x_1 = 0, x_2 = 0, x_3 = 3, x_4 = 0$). As all the basic variables are non-negative, this basic solution is a b.f.s. Further, as the basic variable $x_4 = 0$, this is a degenerate b.f.s.

(ii) $B_2 = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$. For this basis matrix $x_2$ and $x_3$ are basic variables, so taking $x_1 = x_4 = 0$, we get $x_2 = 0$, $x_3 = 3$, thereby getting the basic solution ($x_1 = 0, x_2 = 0, x_3 = 3, x_4 = 0$). It is a b.f.s as all the basic variables are non-negative. Further, it is a degenerate b.f.s as the basic variable $x_2 = 0$.

The above two degenerate b.f.s are different as $B_1$ and $B_2$ are different basis matrices, but they both correspond to the corner point (0, 0, 3, 0) in $R^4$ or (0, 0) in $R^2$.
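As a quick check of this many-to-one behaviour, the following sketch groups the feasible bases of the system $x_1 + x_2 + x_3 = 3$, $x_1 - x_2 + x_4 = 0$ by the point they produce. The dictionary name `points` and the 1-based labelling of variables are choices made for this illustration only.

```python
from fractions import Fraction
from itertools import combinations

# x1 + x2 + x3 = 3,  x1 - x2 + x4 = 0
A = [[1, 1, 1, 0],
     [1, -1, 0, 1]]
b = [3, 0]

points = {}  # feasible point -> list of bases (1-based indices of basic variables)
for cols in combinations(range(4), 2):
    (p, q), (r, s) = [[A[i][j] for j in cols] for i in range(2)]
    det = p * s - q * r
    if det == 0:
        continue  # not a basis matrix
    # Cramer's rule for the two basic variables
    xB = [Fraction(b[0] * s - q * b[1], det), Fraction(p * b[1] - b[0] * r, det)]
    if min(xB) < 0:
        continue  # basic but not feasible
    x = [Fraction(0)] * 4
    for j, v in zip(cols, xB):
        x[j] = v
    points.setdefault(tuple(x), []).append(tuple(j + 1 for j in cols))

for x, bases in points.items():
    print([str(v) for v in x], bases)
```

Besides the bases ($x_3, x_4$) and ($x_2, x_3$) discussed above, the enumeration also reports the basis ($x_1, x_3$) for the same point (0, 0, 3, 0), while each non-degenerate b.f.s appears with exactly one basis.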
The above discussion motivates us to state the following result. 


Result 2.3.4 Every corner point of the feasible region S is a basic feasible solution of the system $Ax = b$, $x \ge 0$; and conversely every basic feasible solution of the above system is a corner point of the set S. Further, for the non-degenerate basic feasible solutions the correspondence is one to one, i.e. two distinct non-degenerate basic feasible solutions correspond to two distinct corner points of S, but for the degenerate basic feasible solutions, more than one degenerate basic feasible solution may correspond to the same corner point.

Result 2.3.5 If the given LPP has an optimal solution then at least one basic feasible solution is optimal.


In view of Result 2.3.5, a possible way to solve an LPP could be to find all b.f.s (at most $^{n}C_{m}$) and choose the one at which the objective function value is maximum. An important point to note here is that the main difficulty of the graphical method, namely that corner points cannot be computed and visualized beyond two or three dimensions, is no longer there. This is because, theoretically, all b.f.s can be computed, no matter how many variables or constraints there are.
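To get a feel for the bound $^{n}C_{m}$ on the number of basis matrices, the values below are worked out for a few sizes. The sizes $(n, m) = (20, 10)$ and $(50, 25)$ are chosen purely for illustration and do not come from the text.

```latex
\[
{}^{n}C_{m} = \frac{n!}{m!\,(n-m)!}, \qquad
{}^{4}C_{2} = 6, \qquad
{}^{20}C_{10} = 184756, \qquad
{}^{50}C_{25} = 126410606437752 \approx 1.26 \times 10^{14}.
\]
```

So even a modest problem with 50 variables and 25 constraints may have more than $10^{14}$ candidate bases, each requiring the solution of a $25 \times 25$ linear system.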

Though the above logic seems to convince us that we now have an algorithm capable of solving every linear programming problem, one can imagine how time consuming it will be to find all $^{n}C_{m}$ (in the worst case) b.f.s and then choose the one at which the objective function value is maximum. Therefore this cannot be taken as an appropriate method to solve LPPs. Nevertheless, corner points (or b.f.s) are important and we have to concentrate on them to develop an efficient algorithm for solving LPPs. A more practical approach seems to be one which does not need all b.f.s at one go, but rather starts from an initial b.f.s only. Then it verifies (using some mathematical and geometrical arguments due to the structure of linearity) whether the current b.f.s is optimal. If not, then in a systematic logical manner it generates a new b.f.s, which is an improved one, and this process is continued till a suitable optimality criterion is satisfied. This is precisely what the simplex method does. Geometrically, it checks the optimality of the current b.f.s (i.e. the current corner point) by verifying whether the value of the objective function at the given corner point is at least as large as its value at all other corner points which are joined to it by an edge, i.e. all adjacent corner points. If this happens, then it will mean that the current corner point (equivalently the current b.f.s) is a local maximum point, which because of linearity will assert that it is a global maximum point. Here it may be noted that even adjacent corner points are not known explicitly, so the desired information is obtained implicitly. However, if the current b.f.s is not optimal then the simplex method generates a new b.f.s so that the objective function value increases (it does not decrease). In fact it generates such a b.f.s by giving a corner point that is adjacent to the current corner point. Thus the simplex method moves along adjacent corner points only, i.e. it moves from the current corner point to another corner point joined by an edge so that the objective function value improves. Since the number of corner points is finite (at most $^{n}C_{m}$), the method will terminate in a finite number of iterations (except in certain rare situations called 'cycling', to be discussed later). We now continue our discussion on the simplex method in a somewhat detailed manner.


2.4 The Simplex Method: Certain Notations

We now consider the standard form LPP and with respect to that introduce the following notations to be used subsequently

$a^{(j)} = \mathrm{col}(a_{1j}, a_{2j}, \ldots, a_{mj})$

$A = [a^{(1)}, a^{(2)}, \ldots, a^{(j)}, \ldots, a^{(n)}]$

$B = [b^{(1)}, b^{(2)}, \ldots, b^{(m)}]$

$x_B = B^{-1}b = \mathrm{col}(x_{B1}, x_{B2}, \ldots, x_{Bm})$

$y^{(j)} = \mathrm{col}(y_{1j}, y_{2j}, \ldots, y_{ij}, \ldots, y_{mj}) = B^{-1}a^{(j)}$ (j = 1, 2, ..., n)

$c_B = \mathrm{col}(c_{B1}, c_{B2}, \ldots, c_{Bi}, \ldots, c_{Bm})$;
$c_{Bi}$ being the coefficient of the basic variable $x_{Bi}$ (i = 1, 2, ..., m) in the objective function,
$z(x_B) = c_B^T x_B = \sum_{i=1}^{m} c_{Bi}x_{Bi}$; and $z_j = c_B^T y^{(j)}$ (j = 1, 2, ..., n).
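To make these notations concrete, here is a minimal sketch that computes $x_B$, the columns $y^{(j)}$, $z(x_B)$ and all $(z_j - c_j)$ for a chosen basis. It assumes the small LPP Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4$ subject to $x_1 + x_2 + x_3 = 8$, $2x_1 + x_2 + x_4 = 10$ as data; the function names `solve` and `tableau_quantities` are hypothetical, and the linear systems are solved by plain Gauss-Jordan elimination rather than by forming $B^{-1}$.

```python
from fractions import Fraction

def solve(B, rhs):
    """Solve B u = rhs by Gauss-Jordan elimination with exact fractions."""
    m = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(B, rhs)]
    for k in range(m):
        # find a row with a nonzero entry in column k (fails if B is singular)
        piv = next(i for i in range(k, m) if M[i][k] != 0)
        M[k], M[piv] = M[piv], M[k]
        M[k] = [v / M[k][k] for v in M[k]]
        for i in range(m):
            if i != k:
                M[i] = [vi - M[i][k] * vk for vi, vk in zip(M[i], M[k])]
    return [row[-1] for row in M]

def tableau_quantities(A, b, c, basis):
    """Return x_B, the list of columns y^(j), z(x_B) and all (z_j - c_j)."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in basis] for i in range(m)]             # basic columns
    xB = solve(B, b)                                             # x_B = B^{-1} b
    Y = [solve(B, [A[i][j] for i in range(m)]) for j in range(n)]  # y^(j) = B^{-1} a^(j)
    cB = [c[j] for j in basis]
    z = sum(ci * xi for ci, xi in zip(cB, xB))                   # z(x_B) = c_B^T x_B
    zc = [sum(ci * yi for ci, yi in zip(cB, Y[j])) - c[j] for j in range(n)]
    return xB, Y, z, zc

A = [[1, 1, 1, 0],
     [2, 1, 0, 1]]
b = [8, 10]
c = [4, 3, 0, 0]

# basis given as 0-based column indices: [2, 3] means (x3, x4), i.e. B = I
xB0, Y0, z0, zc0 = tableau_quantities(A, b, c, [2, 3])
# [1, 0] means (x2, x1)
xB1, Y1, z1, zc1 = tableau_quantities(A, b, c, [1, 0])
print(xB0, z0, zc0)
print(xB1, z1, zc1)
```

With the slack basis the columns $y^{(j)}$ coincide with the columns $a^{(j)}$, while for the basis ($x_2, x_1$) all $(z_j - c_j)$ come out non-negative.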

It is not difficult to understand what the above notations really mean. We know that A is an (m × n) coefficient matrix, and therefore if by $a^{(j)}$ we denote the $j$th column (j = 1, 2, ..., n) of A, then we can write $A = [a^{(1)}, a^{(2)}, \ldots, a^{(j)}, \ldots, a^{(n)}]$. Also, as Rank(A) = m (< n), there exist m linearly independent columns in A which we have denoted by $b^{(1)}, b^{(2)}, \ldots, b^{(m)}$ and called basic columns. Here it may be noted that these are not new columns but come from the matrix A itself, i.e. $b^{(1)}$ is nothing but one of the columns of A, and similarly the other columns $b^{(2)}, \ldots, b^{(m)}$ as well. Since we do not know the exact indices of these basic columns in general, we have used a different notation for them. Further, B is an (m × m) matrix consisting of these m basic columns; it is a basis matrix and so invertible. The (m × 1) vector $x_B = B^{-1}b$ gives the values of the m basic variables $x_{B1}, x_{B2}, \ldots, x_{Bm}$. Again this is not a new vector; its components are the m components of x which correspond to basic variables. If by $x_R$ we denote the vector of non-basic variables then ($x_B = B^{-1}b$, $x_R = 0$) is the basic solution for the basis matrix B. In case all components of the vector $B^{-1}b$ are non-negative, it is a basic feasible solution. As non-basic variables are always zero, we shall call $x_B = B^{-1}b$ itself a basic solution or a basic feasible solution, as the case may be. The m basic columns of A form a basis for $R^m$ and so any column $a^{(j)}$ of A can be written as a linear combination of the basic columns, i.e. $a^{(j)} = y_{1j}b^{(1)} + y_{2j}b^{(2)} + \ldots + y_{mj}b^{(m)}$. Thus the m elements of the vector $y^{(j)}$ are the scalars needed when a nonbasic column $a^{(j)}$ is expressed as a linear combination of the basic columns. As $c_B$ denotes the coefficients of the m basic variables in the objective function and $x_R = 0$, $z(x_B) = c_B^T x_B$ gives the value of the objective function for the basic solution $x_B$. The quantities $(z_j - c_j)$ with $z_j = c_B^T y^{(j)}$ are called relative cost coefficients (or dual variables).

As an illustration, consider the LPP

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$
$x_1, x_2 \ge 0.$

After adding the slack variables we get

Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 + x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$

Therefore

$x = \mathrm{col}(x_1, x_2, x_3, x_4)$
$b = \mathrm{col}(8, 10)$
$A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{bmatrix}$

and Rank(A) = 2 (< 4).

Now if we take $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ as the starting basis matrix then $B = B^{-1} = I$ and so

$x_B = B^{-1}b = b = \mathrm{col}(8, 10)$
$y^{(1)} = B^{-1}a^{(1)} = a^{(1)} = \mathrm{col}(1, 2)$

Step 1 Obtain an initial basic solution $x_B = B^{-1}b$ which is basic feasible, i.e. all $x_{Bi} \ge 0$ (i = 1, 2, ..., m).

Step 2 Check if the b.f.s at hand is optimal. If yes, then we stop.

Step 3 Check if the given LPP has unbounded solution. If yes, then we again stop.

Step 4 If the current b.f.s is not optimal and there is no indication of unbounded solution then generate another b.f.s $\hat{x}_B$ such that $z(\hat{x}_B) \ge z(x_B)$.

Step 5 Continue the above steps till we obtain an optimal solution or there is an indication of unbounded solution.

Now we discuss each of the above steps one by one. Let us first discuss how to obtain an initial b.f.s. For this we note that if all constraints are with the '≤' sign then this step is trivial. We simply get hold of the m slack columns to have an (m × m) identity matrix as the starting basis matrix and obtain the corresponding initial b.f.s $x_B = B^{-1}b = Ib = b$. In the case of mixed constraints, we use the method of artificial variables for obtaining the initial b.f.s, which we plan to discuss later. This latter method will also verify whether the given LPP is feasible or infeasible.

We shall now assume that we have a starting b.f.s, say $x_B = B^{-1}b$. For this b.f.s we form the simplex tableau as described earlier. Then we have the following important results whose proofs we shall give in the next chapter. These results are stated with regard to the current simplex tableau and they help us in performing Steps 2, 3 and 4 as described above.


Result 2.5.1 If all $(z_j - c_j) \ge 0$ then the current b.f.s is optimal.


Result 2.5.2 If some $(z_j - c_j) < 0$ and corresponding to that all $y_{ij} \le 0$ then the given LPP has unbounded solution.


Result 2.5.3 If some $(z_j - c_j) < 0$ and for that some $y_{ij} > 0$ then there exists a new b.f.s $\hat{x}_B$ such that $z(\hat{x}_B) \ge z(x_B)$.


So once the given LPP has a finite optimum and the current b.f.s $x_B$ is not optimal, Result 2.5.3 above guarantees the existence of a new b.f.s $\hat{x}_B$ such that $z(\hat{x}_B) \ge z(x_B)$. In the absence of degeneracy, $z(\hat{x}_B) > z(x_B)$ holds.

As per the development of the simplex method, the new b.f.s $\hat{x}_B$ is obtained by taking one column out of B and entering another column of A which is not already a basic column, thereby getting the new basis matrix $\hat{B}$ such that $z(\hat{x}_B) \ge z(x_B)$. Geometrically this means moving from the current corner point along an edge to get an improved adjacent corner point.

The obvious question here is how to decide which column $a^{(k)}$ of A should be entered in B and which column $b^{(r)}$ of B should be taken out. For this we should concentrate on three basic goals. These are (i) the new matrix $\hat{B}$ is a basis matrix so that the new $\hat{x}_B$ is a basic solution, (ii) every component of $\hat{x}_B$ is non-negative so that it is a b.f.s, and (iii) the objective function value improves (at worst it remains at the same level).

[Flow chart of the simplex method: Initial b.f.s → Compute all $(z_j - c_j)$ → Are all $(z_j - c_j) \ge 0$? If yes: Optimal, Stop. If no: check for Unbounded Solution; otherwise take $z_k - c_k = \min_j (z_j - c_j)$, enter $a^{(k)}$ and take $b^{(r)}$ out, get the new basis matrix $\hat{B}$ and repeat.]

Fig. 2.6.
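The flow chart can be sketched in code as follows. This is a minimal illustration, not the book's implementation: the function name `simplex_max` is hypothetical, the entering column is taken as the one with the most negative $(z_j - c_j)$, ties are broken by the smallest index, the tableau is updated by elementary row operations instead of recomputing $B^{-1}$, and no safeguard against cycling is included. The data are those of the LPP Max $z = 4x_1 + 3x_2$, $x_1 + x_2 \le 8$, $2x_1 + x_2 \le 10$ with slack variables $x_3, x_4$.

```python
from fractions import Fraction

def simplex_max(T, basis):
    """Tableau simplex for a maximization LPP, following the flow chart.

    T: m rows [x_Bi, y_i1, ..., y_in] plus a last row [z, z_1 - c_1, ..., z_n - c_n].
    basis: list with the (1-based) index of the basic variable of each row.
    Both are updated in place. Returns "optimal" or "unbounded".
    """
    m = len(T) - 1
    T[:] = [[Fraction(v) for v in row] for row in T]   # exact arithmetic
    while True:
        last = T[m]
        # entering column k: most negative (z_j - c_j)
        k = min(range(1, len(last)), key=lambda j: last[j])
        if last[k] >= 0:
            return "optimal"            # all (z_j - c_j) >= 0
        rows = [i for i in range(m) if T[i][k] > 0]
        if not rows:
            return "unbounded"          # (z_k - c_k) < 0 and all y_ik <= 0
        # leaving row r: minimum ratio x_Bi / y_ik over rows with y_ik > 0
        r = min(rows, key=lambda i: T[i][0] / T[i][k])
        piv = T[r][k]
        T[r] = [v / piv for v in T[r]]
        for i in range(m + 1):
            if i != r:
                f = T[i][k]
                T[i] = [v - f * w for v, w in zip(T[i], T[r])]
        basis[r] = k

# Initial tableau with basic variables x3, x4 (B = I).
T = [[8, 1, 1, 1, 0],
     [10, 2, 1, 0, 1],
     [0, -4, -3, 0, 0]]
basis = [3, 4]
status = simplex_max(T, basis)
print(status, basis, [row[0] for row in T])
```

On this data the loop stops after two exchanges of columns with the basic variables $x_2 = 6$, $x_1 = 2$ and $z = 26$.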


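Example 2.5.1 Use the simplex method to solve the following LPP and identify the movements graphically

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$
$x_1, x_2 \ge 0.$

Solution We have already solved this LPP graphically and obtained its optimal solution ($x_1^* = 2$, $x_2^* = 6$) with the optimal value $z^* = 26$. Let us now solve this problem by the simplex method. For this we first express it in the standard form by adding slack variables. This gives the following

Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 + x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$

Taking $x_3$ and $x_4$ as the basic variables, i.e. $B = I$, the initial simplex tableau is

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_3 = 8   |    1       1       1       0
    x_4 = 10  |    2       1       0       1
    z = 0     |   -4      -3       0       0

Here the column $a^{(1)}$ enters the basis and the column for the variable $x_4$ leaves the basis (the minimum ratio is $\min\{8/1, 10/2\} = 5$), which gives the next tableau

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_3 = 3   |    0      1/2      1     -1/2
    x_1 = 5   |    1      1/2      0      1/2
    z = 20    |    0      -1       0       2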

This new b.f.s corresponds to the corner point (5, 0) with improved value z = 20.

We again use Result 2.5.1 and conclude that this new b.f.s $\hat{x}_B$ is still not optimal. Also, using Result 2.5.3 we note that column $a^{(2)}$ enters the basis and $x_2$ becomes a basic variable. Further, the minimum ratio criterion gives that the column for the variable $x_3$ leaves the basis and so $x_3$ becomes a nonbasic variable. Thus the next basis matrix is $\hat{B} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$. We again evaluate $\hat{B}^{-1}$ and compute all the relevant entries to get the following

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_2 = 6   |    0       1       2      -1
    x_1 = 2   |    1       0      -1       1
    z = 26    |    0       0       2       1

Now all $(z_j - c_j) \ge 0$ and so Result 2.5.1 asserts that the current b.f.s is optimal. Thus for the given LPP, an optimal solution is $x_1^* = 2$, $x_2^* = 6$ and the maximum value is $z^* = 26$. Also this b.f.s ($x_1^* = 2, x_2^* = 6, x_3^* = 0, x_4^* = 0$) corresponds to the corner point (2, 6), which is an optimal corner point. As the initial b.f.s corresponds to the corner point (0, 0), the above simplex tableaus tell us that we move from the corner point (0, 0) to (5, 0) and then to (2, 6), which is the optimal corner point. This we illustrate in Fig. 2.7.





[Fig. 2.7: movement of the simplex method from (0, 0) to (5, 0) and then to the optimal corner point (2, 6), where z = 26.]

If we now have a second look at the above solution procedure for Example 2.5.1, we note that there is something which is still not very satisfactory, because to get the new elements in the simplex tableau we are first computing the inverse of the new basis matrix $\hat{B}$ explicitly, and then using the same to get $\hat{x}_B$, $\hat{y}^{(j)}$, $\hat{z}_j$ etc. The obvious question is whether we can do something better so that we do not compute $\hat{B}^{-1}$ explicitly. The answer is yes, and that leads us to what is called pivoting in the simplex method.


Pivoting In the Simplex Method 


In the simplex method, we use pivoting to obtain new entries in the simplex tableau (corresponding to the new basis matrix $\hat{B}$) without finding $\hat{B}^{-1}$ explicitly. The basic steps in pivoting are as follows

Step 1 Identify the pivot column. This corresponds to the column for the entering variable in the tableau.

Step 2 Identify the pivot row. This corresponds to the row for the leaving variable in the tableau as obtained by the minimum ratio criterion.

Step 3 Identify the pivot element. This is the element which is common to both the pivot column and the pivot row. Let this be denoted by $y_{pq}$, where in the simplex tableau the $p$th row is the pivot row and the $q$th column is the pivot column.

Step 4

(i) Divide all entries in the pivot row by the pivot element, i.e.

$\hat{y}_{pj} = \dfrac{y_{pj}}{y_{pq}}$ (j = 0, 1, 2, ..., n), where $y_{p0} = x_{Bp}$.

This will give $\hat{y}_{pq} = 1$.

(ii) Take all remaining entries in the pivot column as zero. (This is because for a basic variable $z_j - c_j = 0$ and $y^{(j)}$ is an identity column.)

(iii) For the remaining entries of the tableau, use the update formula

$\hat{y}_{ij} = y_{ij} - \dfrac{y_{pj}\,y_{iq}}{y_{pq}}$ (i = 1, 2, ..., m, m+1 and j = 0, 1, 2, ..., n).

Here we are identifying $y_{i0} = x_{Bi}$ (i = 1, 2, ..., m), $y_{m+1,0} = z(x_B)$ and $y_{m+1,j} = (z_j - c_j)$, so that the same update formula remains valid for all the remaining entries of the tableau.
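Step 4 translates almost line by line into code. The sketch below is an illustration under the convention that column 0 of each row stores $y_{i0}$ and that the last row stores $z(x_B)$ and the $(z_j - c_j)$; the function name `pivot` and the 0-based row and column indices are choices made here, not notation from the text. The tableau used is the one for Max $z = 4x_1 + 3x_2$ with basic variables $x_3, x_4$.

```python
from fractions import Fraction

def pivot(T, p, q):
    """Return the tableau obtained from T by pivoting on row p, column q.

    T is a list of rows; column 0 holds x_B (and z in the last row),
    columns 1..n hold the y^(j) (and z_j - c_j in the last row).
    """
    ypq = Fraction(T[p][q])
    new = []
    for i, row in enumerate(T):
        if i == p:
            # Step 4(i): divide the pivot row by the pivot element
            new.append([Fraction(v) / ypq for v in row])
        else:
            # Step 4(iii): y_ij - y_pj * y_iq / y_pq; for j = q this gives 0,
            # which is exactly Step 4(ii)
            new.append([Fraction(v) - Fraction(T[p][j]) * row[q] / ypq
                        for j, v in enumerate(row)])
    return new

# Rows for x3 and x4, followed by the (z, z_j - c_j) row.
T0 = [[8, 1, 1, 1, 0],
      [10, 2, 1, 0, 1],
      [0, -4, -3, 0, 0]]
T1 = pivot(T0, p=1, q=1)   # x1 enters, x4 leaves (pivot element 2)
T2 = pivot(T1, p=0, q=2)   # x2 enters, x3 leaves (pivot element 1/2)
for row in T2:
    print([str(v) for v in row])
```

Because the same formula is applied to the last row, the value of $z$ and the relative cost coefficients are updated together with the constraint rows.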


Let us now revisit Example 2.5.1 and consider the initial simplex tableau

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_3 = 8   |    1       1       1       0
    x_4 = 10  |   [2]      1       0       1
    z = 0     |   -4      -3       0       0

for the initial basis matrix $B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. As discussed in Example 2.5.1, the column $a^{(1)}$ enters the basis (so that in the new tableau $x_1$ becomes a basic variable) and column $b^{(2)} = a^{(4)}$ leaves the basis (so that in the new tableau $x_4$ becomes a nonbasic variable), and therefore the new basis matrix is $\hat{B} = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$. Now we identify the pivot column, pivot row and pivot element in the above tableau as indicated (the pivot element is shown in square brackets) and perform Step 4 of the pivoting. This gives

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_3 = 3   |    0      1/2      1     -1/2
    x_1 = 5   |    1      1/2      0      1/2
    z = 20    |    0      -1       0       2

Here the second row is obtained by dividing the pivot row by the pivot element $y_{pq} = 2$, and then taking other entries in the corresponding column as zero. This gives the second row and the second column in the new tableau. For the remaining elements we have to use the update formula as described in Step 4. For example the new value of $x_3$, namely $\hat{x}_3$, is obtained as $\hat{x}_3 = 8 - \dfrac{10 \times 1}{2} = 8 - 5 = 3$.

Similarly, if $\hat{z}$ and $\hat{z}_2 - \hat{c}_2$ respectively denote the new values of $z$ and $z_2 - c_2$, then

$\hat{z} = 0 - \dfrac{10 \times (-4)}{2} = 20$

$\hat{z}_2 - \hat{c}_2 = -3 - \dfrac{1 \times (-4)}{2} = -3 + 2 = -1$, etc.


In a similar manner, using pivoting in the new tableau so obtained, we get the following tableau

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_2 = 6   |    0       1       2      -1
    x_1 = 2   |    1       0      -1       1
    z = 26    |    0       0       2       1

which gives the optimal solution $x_1^* = 2$, $x_2^* = 6$ and the optimal value $z^* = 26$ as obtained already. We shall give proper mathematical justification of pivoting in the next chapter, where we shall prove that pivoting will always give the same tableau as the one we would have obtained by finding $\hat{B}^{-1}$ explicitly.

If the constraints of the given LPP are of mixed type, we shall in general not have an (m × m) identity matrix to start with. As we have already seen the advantage of starting with an (m × m) identity matrix (virtually no computation is required for the initial simplex tableau), we are tempted to introduce (artificially or as per force) as many identity columns as are required to have an (m × m) identity matrix. Since the given LPP is in the standard form, all constraints are already with the '=' sign, and no new variables can be added at a positive level. But unless a new variable is added in a constraint there is no possibility of introducing an identity column at that place. So we introduce these variables 'artificially', i.e. wherever an identity column is missing, say in the $i$th constraint, we introduce a variable $x_{ai}$ and call it the $i$th artificial variable. Here it should be noted that if the constraints of the given LPP are consistent, i.e. the given LPP is feasible, the variable $x_{ai}$ cannot take a positive value. Therefore, we take appropriate precautions to make these artificial variables zero as early as possible. Further, as long as some $x_{ai} > 0$, geometrically speaking we are not occupying a feasible corner and are somewhere outside the feasible region. It is only when all $x_{ai} = 0$ that a feasible corner is obtained. If in some problem it is not possible to make all $x_{ai} = 0$ eventually, then that LPP is infeasible or equivalently the constraints are inconsistent.

At this stage we may ask why we cannot start with a non-identity matrix. Theoretically we can certainly do so, but for any meaningfully large problem (never think that we are interested in LPPs of 2 or 3 variables only; there may be hundreds or thousands of variables) it is an almost impossible task, because we may have to check all possible $^{n}C_{m}$ combinations and then check that we really get a basis matrix which gives rise to a b.f.s.

There are two popular methods which require the introduction of artificial variables to give an initial b.f.s. These are called the two phase method and the big-M method. We now explain each of these through examples only.


The Two Phase Method 


We shall explain the two phase method with the help of the following LPP

Min $z = 2x_1 + x_2$
subject to
$3x_1 + x_2 = 3$
$4x_1 + 3x_2 \ge 6$
$x_1 + 2x_2 \le 3$
$x_1, x_2 \ge 0.$

In the standard form this LPP becomes

Max $z = -2x_1 - x_2 + 0x_3 + 0x_4$
subject to
$3x_1 + x_2 = 3$
$4x_1 + 3x_2 - x_3 = 6$
$x_1 + 2x_2 + x_4 = 3$
$x_1, x_2, x_3, x_4 \ge 0.$ (2.7)

Here the matrix $A = \begin{bmatrix} 3 & 1 & 0 & 0 \\ 4 & 3 & -1 & 0 \\ 1 & 2 & 0 & 1 \end{bmatrix}$ does not have a (3 × 3) identity matrix. In fact we are missing the columns (1, 0, 0) and (0, 1, 0), as the column (0, 0, 1) is already present due to the slack variable $x_4$. So we add two artificial variables $x_{a1}$ and $x_{a2}$ (both $\ge 0$) and construct the following Phase-I problem

Max $z_a = -x_{a1} - x_{a2}$
subject to
$3x_1 + x_2 + x_{a1} = 3$
$4x_1 + 3x_2 - x_3 + x_{a2} = 6$
$x_1 + 2x_2 + x_4 = 3$
$x_1, x_2, x_3, x_4, x_{a1}, x_{a2} \ge 0.$ (2.8)

Here we note that the coefficient matrix of the Phase-I problem (2.8) has the desired (3 × 3) identity matrix, so a starting solution for solving the Phase-I problem by the simplex method is readily available.
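The construction of the Phase-I problem can be sketched as a small routine that looks for identity columns and appends an artificial column wherever one is missing. The function name `phase_one_problem` and its return convention are hypothetical, and the sketch assumes the LPP is already in the standard form with $b \ge 0$.

```python
def phase_one_problem(A):
    """Append artificial columns so that A contains an m x m identity matrix.

    Returns (A1, c1, basis): the enlarged matrix, the Phase-I cost vector
    (0 for original variables, -1 for artificial variables) and, for each
    row, the 0-based index of the column used as the initial basic column.
    """
    m, n = len(A), len(A[0])
    A1 = [list(row) for row in A]
    basis = []
    for i in range(m):
        unit = [1 if r == i else 0 for r in range(m)]
        # look for an existing identity column e_i among the original columns
        j = next((j for j in range(n)
                  if [A[r][j] for r in range(m)] == unit), None)
        if j is None:
            # identity column missing: add an artificial variable for row i
            for r in range(m):
                A1[r].append(unit[r])
            j = len(A1[0]) - 1
        basis.append(j)
    c1 = [0] * n + [-1] * (len(A1[0]) - n)
    return A1, c1, basis

# Coefficient matrix of (2.7)
A = [[3, 1, 0, 0],
     [4, 3, -1, 0],
     [1, 2, 0, 1]]
A1, c1, basis = phase_one_problem(A)
print(A1)
print(c1, basis)
```

For (2.7) the routine adds artificial columns for the first two constraints and reuses the slack column of $x_4$ for the third, which reproduces the coefficient matrix and objective of (2.8).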

Since all $x_{ai} \ge 0$, the objective function value of the Phase-I problem is $\le 0$. Hence if the constraints of the given LPP are consistent, the optimal value of the Phase-I problem must be zero.

Thus after solving the Phase-I problem by the usual simplex method, there are two possibilities which may arise. The first possibility is that the optimal value of the Phase-I objective function is zero, or equivalently in the optimal solution of the Phase-I problem all artificial variables appear at the zero level. In such a situation the given LPP is feasible and an initial b.f.s has been obtained. The second possibility is that this does not happen, i.e. the optimal value of the Phase-I objective function is strictly less than zero, or equivalently in the optimal solution of the Phase-I problem some artificial variable is strictly positive. If this happens then it is an indication that the feasible region of the given LPP is empty, i.e. the given LPP is infeasible.

Here the optimal value of the Phase-I problem is zero and hence the constraints of the original problem are consistent. Further, as all the artificial variables are nonbasic, this gives the starting b.f.s for the original problem as $x_1 = 3/5$, $x_2 = 6/5$, $x_4 = 0$. Now we go to Phase-II for finding an optimal solution of the given problem. (What happens if instead of $x_{a2}$ we take $x_4$ as the leaving variable? See Example 2.8.1 for the case when some artificial variables are present in the basis at the zero level.)


The Phase-II Problem

Since all $(z_j - c_j) \ge 0$, the Phase-II problem has been solved. Therefore for the given problem, ($x_1^* = 3/5$, $x_2^* = 6/5$) is optimal with the optimal value $z^* = -(-12/5) = 12/5$.


The Big-M Method 


This method is similar to the two phase method except that rather than having two separate problems for Phase-I and Phase-II, here we have a combined problem. After converting the given LPP to the standard form, here again we introduce an appropriate number of artificial variables $x_{ai}$ in the appropriate constraints so as to get an (m × m) identity matrix to be taken as an initial basis matrix. But in this approach, the artificial variables are assigned a very large negative cost in the objective function. The simplex method, while trying to improve the objective function, will find the artificial variables uneconomical to maintain as basic variables at a positive value. Hence they will be quickly replaced in the basis by the real (actual) variables with smaller costs. For hand calculations it is not necessary to assign a specific cost value to the artificial variables. The general approach is to take the cost of the artificial variables in the objective function as $-M$, M being a large positive number. However, if the problem is to be solved on a machine then the value of M has to be specified. It is customary to take the value of M as $100 \times \max_j |c_j|$.

Let us solve the problem given below by the Big-M method.

Max $z = 3x_1 - x_2 - x_3$
subject to
$x_1 - 2x_2 + x_3 \le 11$
$-4x_1 + x_2 + 2x_3 \ge 3$
$-2x_1 + x_3 = 1$
$x_1, x_2, x_3 \ge 0.$ (2.9)

As in the two phase method, we first convert the given LPP to the standard form to get

Max $z = 3x_1 - x_2 - x_3 + 0x_4 + 0x_5$
subject to
$x_1 - 2x_2 + x_3 + x_4 = 11$
$-4x_1 + x_2 + 2x_3 - x_5 = 3$
$-2x_1 + x_3 = 1$
$x_1, x_2, x_3, x_4, x_5 \ge 0.$ (2.10)

Now we introduce two artificial variables $x_{a1}$ and $x_{a2}$ in the second and third constraints respectively. It is obvious that once the LPP is in the standard form, the artificial variables will be introduced only in those constraints which are originally with the '≥' or '=' sign, because these will not give identity columns. Only those constraints where slack variables are introduced will give rise to identity columns. For the problem at hand, we need to solve the following problem by the Big-M method. Here, as explained, we form the combined objective function by taking the objective function as given and attaching a very large negative cost to each artificial variable.

Max $z = 3x_1 - x_2 - x_3 + 0x_4 + 0x_5 - Mx_{a1} - Mx_{a2}$
subject to
$x_1 - 2x_2 + x_3 + x_4 = 11$
$-4x_1 + x_2 + 2x_3 - x_5 + x_{a1} = 3$
$-2x_1 + x_3 + x_{a2} = 1$
$x_1, x_2, x_3, x_4, x_5, x_{a1}, x_{a2} \ge 0.$ (2.11)

Now taking $x_4$, $x_{a1}$, $x_{a2}$ as the initial basic variables and solving the problem by the simplex method we obtain the following simplex tableaus


    x_B          |  y^(1)     y^(2)    y^(3)    y^(4)   y^(5)   y^(a1)    y^(a2)
    x_4 = 11     |    1        -2        1        1       0       0         0
    x_{a1} = 3   |   -4         1        2        0      -1       1         0
    x_{a2} = 1   |   -2         0        1        0       0       0         1
    z = -4M      | (-3+6M)    (1-M)    (1-3M)     0       M       0         0

After successive pivoting, the final tableau is

    x_B          |  y^(1)     y^(2)    y^(3)    y^(4)   y^(5)   y^(a1)    y^(a2)
    x_1 = 4      |    1         0        0       1/3    -2/3     2/3      -5/3
    x_2 = 1      |    0         1        0        0      -1       1        -2
    x_3 = 9      |    0         0        1       2/3    -4/3     4/3      -7/3
    z = 2        |    0         0        0       1/3     1/3   (M - 1/3)  (M - 2/3)
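The last row of the initial Big-M tableau can be checked by hand from $z_j - c_j = c_B^T y^{(j)} - c_j$. The working below assumes the initial basic variables $(x_4, x_{a1}, x_{a2})$, so that $c_B = \mathrm{col}(0, -M, -M)$, and uses the columns of (2.11); it is a verification sketch rather than part of the original derivation.

```latex
\begin{align*}
z(x_B)    &= 0\cdot 11 + (-M)\cdot 3 + (-M)\cdot 1 = -4M,\\
z_1 - c_1 &= (-M)(-4) + (-M)(-2) - 3 = 6M - 3,\\
z_2 - c_2 &= (-M)(1) + (-M)(0) - (-1) = 1 - M,\\
z_3 - c_3 &= (-M)(2) + (-M)(1) - (-1) = 1 - 3M,\\
z_5 - c_5 &= (-M)(-1) + (-M)(0) - 0 = M.
\end{align*}
```

For large M the most negative entry is $1 - 3M$, so under the most-negative rule the column of $x_3$ would be the first to enter the basis.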


At this stage all $(z_j - c_j) \ge 0$ and so an optimal solution has been obtained. Therefore $x_1^* = 4$, $x_2^* = 1$, $x_3^* = 9$ is an optimal solution and $z^* = 2$ is the optimal value.

We now take a few more examples and solve them by the two phase method. It will be useful if readers solve these by the Big-M method as well.


Example 2.6.1 Solve the following problem by the simplex method and verify your answer graphically

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \ge 10$
$x_1, x_2 \ge 0.$

Solution It is obvious that we need only one artificial variable $x_{a1}$ and therefore the Phase-I problem is

Max $z_a = -x_{a1}$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 - x_4 + x_{a1} = 10$
$x_1, x_2, x_3, x_4, x_{a1} \ge 0.$

Now the columns for the variables $x_3$ and $x_{a1}$ give identity columns and therefore taking these as basic variables we have the following simplex tableaus

    x_B          |  y^(1)   y^(2)   y^(3)   y^(4)   y^(a1)
    x_3 = 8      |    1       1       1       0       0
    x_{a1} = 10  |    2       1       0      -1       1
    z_a = -10    |   -2      -1       0       1       0

    x_B          |  y^(1)   y^(2)   y^(3)   y^(4)   y^(a1)
    x_3 = 3      |    0      1/2      1      1/2    -1/2
    x_1 = 5      |    1      1/2      0     -1/2     1/2
    z_a = 0      |    0       0       0       0       1

At this stage all $(z_j - c_j) \ge 0$ and so the Phase-I problem has been solved. As the value of the Phase-I objective function is zero, the given LPP is feasible, i.e. the constraints are consistent. Also all artificial variables are nonbasic variables and so we go to Phase-II directly. The Phase-II problem is

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 - x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$

Here, from the last tableau of Phase-I we drop the artificial columns and reevaluate the elements of the last row with respect to the new $c_B$ (in our example $c_B = (0, 4)^T$) to get the initial tableau of the Phase-II problem. This gives the following

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_3 = 3   |    0      1/2      1      1/2
    x_1 = 5   |    1      1/2      0     -1/2
    z = 20    |    0      -1       0      -2

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_4 = 6   |    0       1       2       1
    x_1 = 8   |    1       1       1       0
    z = 32    |    0       1       4       0
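The re-evaluation of the last row for Phase-II can be checked by hand. The working below assumes that the final Phase-I tableau has basic variables $x_3 = 3$ and $x_1 = 5$ with $y^{(2)} = \mathrm{col}(1/2, 1/2)$ and $y^{(4)} = \mathrm{col}(1/2, -1/2)$; these values are worked out from the constraints of this example and are not quoted from the text.

```latex
\begin{align*}
c_B &= (0,\,4)^T, \qquad z(x_B) = 0\cdot 3 + 4\cdot 5 = 20,\\
z_2 - c_2 &= 0\cdot\tfrac{1}{2} + 4\cdot\tfrac{1}{2} - 3 = -1,\\
z_4 - c_4 &= 0\cdot\tfrac{1}{2} + 4\cdot\left(-\tfrac{1}{2}\right) - 0 = -2,\\
z_1 - c_1 &= z_3 - c_3 = 0 \quad \text{(basic variables)}.
\end{align*}
```

Since $z_4 - c_4$ is the most negative entry, $x_4$ enters the basis, and the only positive entry of $y^{(4)}$ lies in the row of $x_3$, so $x_3$ leaves.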





Therefore the optimal solution is $x_1^* = 8$, $x_2^* = 0$, and $z^* = 32$ is the optimal value.

We now solve the given problem graphically. The feasible region is depicted in Fig. 2.8 with the corner points A: (2, 6), B: (5, 0), C: (8, 0) respectively. Also z(A) = 26, z(B) = 20, and z(C) = 32. Therefore, the corner point C, namely ($x_1 = 8$, $x_2 = 0$), is the optimal solution and $z^* = 32$ is the optimal value. This is the same as obtained by the two phase method.

[Figure: feasible region bounded by the lines $2x_1 + x_2 = 10$ and $x_1 + x_2 = 8$.]

Fig. 2.8.


Example 2.6.2 Solve the following problem by the simplex method and verify your answer graphically

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \ge 8$
$2x_1 + x_2 \ge 10$
$x_1, x_2 \ge 0.$

Solution Here we need two artificial variables $x_{a1}$ and $x_{a2}$ and the Phase-I problem is

Max $z_a = -x_{a1} - x_{a2}$
subject to
$x_1 + x_2 - x_3 + x_{a1} = 8$
$2x_1 + x_2 - x_4 + x_{a2} = 10$
$x_1, x_2, x_3, x_4, x_{a1}, x_{a2} \ge 0.$

Now in the initial b.f.s $x_{a1}$ and $x_{a2}$ are the basic variables. This gives the following simplex tableaus

As the optimal objective function value of the Phase-I problem is zero, the given LPP is feasible. Also both artificial variables are nonbasic variables and so we drop them and go to Phase-II as explained. The Phase-II problem is

Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4$
subject to
$x_1 + x_2 - x_3 = 8$
$2x_1 + x_2 - x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$

Now, the initial tableau of the Phase-II problem is

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_2 = 6   |    0       1      -2       1
    x_1 = 2   |    1       0       1      -1
    z = 26    |    0       0      -2      -1

Here the last row has been written w.r.t. the objective function of the Phase-II problem by taking $c_B = (3, 4)^T$.

After pivoting we get the next tableau as

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_2 = 10  |    2       1       0      -1
    x_3 = 2   |    1       0       1      -1
    z = 30    |    2       0       0      -3

Here $(z_4 - c_4) = -3 < 0$ and all $y_{i4} \le 0$, so by Result 2.5.2 the given LPP has unbounded solution.

We now solve the above problem graphically. The feasible region is as depicted in Fig. 2.9 and the arrow indicates the direction of increase of the given objective function. It clearly indicates that the given LPP has unbounded solution, which matches with what has been indicated by the simplex method.

[Figure: unbounded feasible region above the lines $2x_1 + x_2 = 10$ and $x_1 + x_2 = 8$, with an arrow in the direction of increase of the objective function.]

Fig. 2.9.


Example 2.6.3 Solve the following problem by the simplex method and verify your answer graphically

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$5x_1 + 6x_2 \ge 60$
$x_1, x_2 \ge 0.$

Solution Here we need only one artificial variable $x_{a1}$ and therefore the Phase-I problem is

Max $z_a = -x_{a1}$
subject to
$x_1 + x_2 + x_3 = 8$
$5x_1 + 6x_2 - x_4 + x_{a1} = 60$
$x_1, x_2, x_3, x_4, x_{a1} \ge 0.$

Now, as all $(z_j - c_j) \ge 0$, the Phase-I problem has been solved. But here the optimal value of the Phase-I objective function is not zero and so the given LPP is infeasible. This is also indicated by the fact that in the optimal solution of the Phase-I problem, $x_{a1}$ is still present at a positive level, namely $x_{a1} = 12$, thereby making the original constraints inconsistent.

We now solve the given problem graphically. The feasible region is depicted in Fig. 2.10. As the two lines $x_1 + x_2 = 8$ and $5x_1 + 6x_2 = 60$ do not intersect in the first quadrant, there is no point in $R^2$ satisfying all the constraints and so the set of feasible solutions is an empty set. Thus the given LPP is infeasible or equivalently the constraints are inconsistent.

... optimal solutions. In the second scenario, i.e. when there are infinitely many optimal solutions, we say that the given LPP has alternative optima. Now a glance at Fig. 2.5 suggests that not all alternative optimal solutions are corner points (b.f.s). In fact in this figure, the set of all optimal solutions is the line segment joining the two optimal corner points. In general, it can be shown that the set of all optimal solutions is the convex set spanned by the optimal corner points, i.e. it is the convex hull of the optimal corner points.

If the given LPP is such that it can be solved by the graphical method, then certainly all optimal corner points can be identified geometrically and hence the relevant convex hull, i.e. the set of all optimal solutions, can also be viewed geometrically. The obvious question here is whether the simplex method can also do the same. The answer is 'yes'. From the optimal simplex tableau, using the result given below, we can infer whether the given LPP has alternative optima or not. Further, if there is an indication that alternative optima do exist then we can also determine all optimal basic feasible solutions.


Result 2.7.1 In the optimal simplex tableau of the given LPP, if for some non-basic variable $x_j$, $z_j - c_j = 0$ and for that some $y_{ij} > 0$, then the problem has alternative optima.
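The test of Result 2.7.1 is easy to automate once an optimal tableau is at hand. In the sketch below the function name `alternative_optima_columns` is hypothetical, and the tableau is one worked out by hand for the illustrative LPP Max $z = x_1 + x_2$ subject to $x_1 + x_2 \le 8$, $2x_1 + x_2 \le 10$ with slack variables $x_3$, $x_4$ and optimal basic variables $x_2 = 6$, $x_1 = 2$.

```python
def alternative_optima_columns(T, basis):
    """Return the (1-based) nonbasic columns j satisfying Result 2.7.1.

    T is an optimal tableau: m rows [x_Bi, y_i1, ..., y_in] followed by the
    row [z, z_1 - c_1, ..., z_n - c_n]; basis holds the (1-based) index of
    the basic variable of each row.
    """
    m = len(T) - 1
    last = T[m]
    return [j for j in range(1, len(last))
            if j not in basis                        # nonbasic variable
            and last[j] == 0                         # z_j - c_j = 0
            and any(T[i][j] > 0 for i in range(m))]  # some y_ij > 0

# Assumed optimal tableau (rows: x2, x1, then the z row).
T = [[6, 0, 1, 2, -1],
     [2, 1, 0, -1, 1],
     [8, 0, 0, 1, 0]]
cols = alternative_optima_columns(T, basis=[2, 1])
print(cols)
```

A non-empty list signals alternative optima; entering any of the reported columns by the usual minimum ratio criterion leads to another optimal b.f.s with the same objective value.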


We now take following example to explain some of the points discussed above. 


Example 2.7.1 Use the simplex method to solve the following problem and verify your results graphically

Max $z = x_1 + x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$
$x_1, x_2 \ge 0.$

Solution Here we need two slack variables to get the following standard form LPP

Max $z = x_1 + x_2 + 0x_3 + 0x_4$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 + x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$

Now in the initial b.f.s, $x_3$ and $x_4$ are basic variables. This gives the following simplex tableaus

So an optimal solution of the given LPP is ($x_1^* = 2$, $x_2^* = 6$) and the optimal value is $z^* = 8$. Also Result 2.7.1 is applicable, indicating that the given problem has alternative optima. Since the hypothesis of Result 2.7.1 holds for the nonbasic variable $x_4$, there exists another optimal b.f.s in which $x_4$ is a basic variable. To find this optimal b.f.s we make $x_4$ a basic variable in the next iteration and as per the minimum ratio criterion make $x_1$ a nonbasic variable to get the following tableau

    x_B       |  y^(1)   y^(2)   y^(3)   y^(4)
    x_2 = 8   |    1       1       1       0
    x_4 = 2   |    1       0      -1       1
    z = 8     |    0       0       1       0

Thus we obtain another optimal b.f.s given by ($x_1^* = 0$, $x_2^* = 8$). If we now solve the given problem by the graphical method, we get (2, 6) and (0, 8) as two optimal corner points, which are the same as obtained by the simplex method. As we shall prove later, the set of all optimal solutions of an LPP is a convex set and therefore if the given LPP has more than one optimal solution then it has infinitely many optimal solutions.

Fig. 2.11.


2.8 Redundancy in Linear Programming


While applying the simplex method we have emphasized that the given LPP must be in the standard form

Max $z = c^T x$
subject to
$Ax = b$
$x \ge 0$ (2.12)

with (i) $b \ge 0$ and (ii) Rank A = m (< n). Sometimes it may happen that Rank A is less than m, i.e. the rows of A are linearly dependent or equivalently there is redundancy in the constraints of (2.12). In this section we wish to understand this scenario and show that the simplex method itself can be used to detect the same.

Let us now go back to the two phase method discussed in Section 2.6 and note that
if the given LPP is feasible then at the end of Phase I, the optimal value of the Phase-I
objective function is certainly zero. This implies that in the optimal solution of the
Phase-I problem all artificial variables $x_{ai}$ take the zero value. But a variable may
take the zero value in two ways. Either it is a nonbasic variable (hence it is zero) or it
is a basic variable at the zero level. If in the optimal Phase-I tableau all $x_{ai}$ take the
zero value as nonbasic variables then we drop the corresponding columns from the last
Phase-I tableau and go to Phase-II directly as explained in Section 2.6.

However, sometimes it may happen that although the optimal value of the Phase-I
objective function is zero, some artificial variable is present in the basis at the zero
level. In such a situation we cannot delete the corresponding artificial column for going
to Phase-II, because then we shall be short of one basic variable. So what should we
do in these situations? It is obvious that we cannot go to Phase-II unless we have a
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Example 2.8.1 Use the simplex method to solve the following 
Max $z = -2x_1 - x_2 + 0x_3 + 0x_4$
subject to
$3x_1 + x_2 = 3$
$4x_1 + 3x_2 - x_3 = 6$
$x_1 + 2x_2 + x_4 = 3$
$x_1, x_2, x_3, x_4 \ge 0.$
Solution The above example is the same as the one discussed in Section 2.6. So we
have the following Phase-I problem


Max $z_a = -x_{a1} - x_{a2}$
subject to
$3x_1 + x_2 + x_{a1} = 3$
$4x_1 + 3x_2 - x_3 + x_{a2} = 6$
$x_1 + 2x_2 + x_4 = 3$
$x_1, x_2, x_3, x_4, x_{a1}, x_{a2} \ge 0$


and the corresponding simplex tableaus are 


XB y” y) y! 1) y) yi) y) 





















XB yd y yd yO yd yA 















for the variable x4. Theoreti- 
Therefore, ($x_1^* = 3/5$, $x_2^* = 6/5$) is an optimal solution of the given LPP with the optimal
value $z^* = -12/5$.


Example 2.8.2 Use the simplex method to solve the following


Max $z = x_1 - 2x_2 + 3x_3$
subject to
$x_1 + x_2 + x_3 = 6$
$-x_1 + x_2 + 2x_3 = 4$
$2x_2 + 3x_3 = 10$
$x_1, x_2, x_3 \ge 0.$


Solution We first write the given LPP in the standard form, i.e.
Max $z = x_1 - 2x_2 + 3x_3 + 0x_4$
subject to
$x_1 + x_2 + x_3 = 6$
$-x_1 + x_2 + 2x_3 = 4$
$2x_2 + 3x_3 = 10$
Bt XA =i 7 

$x_1, x_2, x_3, x_4 \ge 0.$





£ J Jai 

Bee A = it 2D 0 
eA z= 2 3 0 and Rank A=3 where as the number of constraints, i.e. ^ 

6 0 1 4 


e by noting that the sun 
in the following we use the simple 


ruct the Phase-] problem 
Za = —Xal — Xa? — Xg3 
subject to 
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and get the following last tableau for the sane as 


1 2 43 
y” y) y®) y® ye” ye” y! ) 










Vp E o, 
e245 0 Ott 2e AL/2- Zee 
0 0 =i m P 


1 0 0 









As at the end of Phase-I, the objective function value is zero, the given LPP is
feasible. But before going to Phase-II, we have to exchange $x_{a3}$ as it is zero as a basic
variable. For this, in the row for the variable $x_{a3}$ in the tableau, we have to check if the
$y_{ij}$ value is nonzero for some genuine (i.e. not an artificial variable) nonbasic variable.
If no such value exists, then there is redundancy in the LPP. In the above tableau, the
only variable which can be exchanged with $x_{a3}$ is $x_4$ but for that the corresponding $y_{ij}$
value is zero. So here exchange is NOT possible and so the given LPP has redundancy.
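
The redundancy found by the Phase-I argument can be cross-checked independently by comparing the rank of the equality constraints with their number. The sketch below is not the simplex-based test described in the text. It assumes that the three equality constraints of Example 2.8.2 read $x_1 + x_2 + x_3 = 6$, $-x_1 + x_2 + 2x_3 = 4$ and $2x_2 + 3x_3 = 10$, and the function name `rank` is hypothetical.

```python
from fractions import Fraction as F

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    M = [[F(v) for v in row] for row in rows]
    r = 0                                   # number of pivot rows found so far
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue                        # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

A = [[1, 1, 1], [-1, 1, 2], [0, 2, 3]]
Ab = [[1, 1, 1, 6], [-1, 1, 2, 4], [0, 2, 3, 10]]
print(rank(A), rank(Ab))   # equal ranks below 3: consistent but redundant
```

Both ranks equal 2 while there are three equations, so the system is consistent and one equation (here the third, being the sum of the first two) can be dropped.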

Since $x_{a3}$ is the artificial variable which is zero as a basic variable and that cannot
be exchanged, the original constraint where $x_{a3}$ has been added is redundant. Therefore
to solve the given LPP we should first drop that redundant constraint and then solve
the problem. Equivalently, we drop the third row (row for $x_{a3}$) from the above tableau
and then go to Phase-II to get the following tableau


xp | yD y2 y ya 





Therefore, ($x_1^* = 2$, $x_2^* = 2$, $x_3^* = 2$) is an optimal solution of the given LPP with the
optimal value $z^* = 4$.


2.9 Degeneracy and Cycling 


In the context of the equivalence between basic feasible solutions and corner points
we have observed that two non-degenerate b.f.s. will correspond to two distinct corner
points but more than one degenerate b.f.s. may correspond to the same corner point.
Therefore, in the absence of degeneracy, the simplex method will terminate in a finite
number of iterations because then there will be strict improvement in the value of
the objective function at successive corner points which will be
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i F. me that for the negative most value of (z; = c) 
function value is o = a ef pre is certainly a possibility of such a scenario), 
on ee ts minimum ratio criteria the variable xp, willl pire: fea 3 ir 
next tableau. Also the variable x; will become a basic variable A t = at Te 
as yx(> 0). Then using the pivoting rule we get Xp, = Xk = 0 an 2(xB rf ; : is 
shows that the new solution $\hat{x}_B$ is also degenerate and there is no change in the objective
function value. In fact both degenerate b.f.s $x_B$ and $\hat{x}_B$ correspond to the same corner
point. So though we have performed one iteration algebraically we have not left the
current corner point at all.

Now what we have assumed for $x_B$ can certainly happen for $\hat{x}_B$ so that the new
b.f.s $\tilde{x}_B$ is also degenerate and all three b.f.s, namely, $x_B$, $\hat{x}_B$, and $\tilde{x}_B$ correspond to
the same corner point. Continuing with this argument we infer that there is certainly 
a possibility (though very rare) that we get involve into a sequence of ae ta bai 
TAEA — Bes an) all corresponding to the same corner point, and a ale In this 
situation we shall repeat the same sequence and continue indefinitely. This phenomenon 
is called cycling in the simplex method. 

If for some problem we are trapped in cycling, then the simplex method will never 
terminate because though algebraic iterations are being performed, geometrically, there 
is no movement as all degenerate b.f.s. in the sequence correspond to the same corner 
point. 

Thus except for cycling, the simplex method will always terminate in a finite number
of iterations. Here the readers should appreciate two points. Firstly, cycling occurs due
to degeneracy but degeneracy does not always lead to cycling. We need that rare
coincidence when all degenerate b.f.s. meet the stated conditions so that all correspond
to the same corner point. Secondly, degeneracy is very common in applying the simplex
method to real life problems but cycling is very uncommon. Nevertheless, since cycling
may occur in the simplex method, ways have been devised to augment the simplex
method so that cycling is avoided, e.g. Hadley [72], but we do not plan to discuss these here.

We now present an artificially constructed example to illustrate cycling in the simplex 
method. For this let us consider the LPP 

Max $z = \frac{3}{4}x_1 - 20x_2 + \frac{1}{2}x_3 - 6x_4$
subject to
$\frac{1}{2}x_1 - 12x_2 - \frac{1}{2}x_3 + 3x_4 \le 0$
$\frac{1}{4}x_1 - 8x_2 - x_3 + 9x_4 \le 0$
$x_3 \le 1$
$x_1, x_2, x_3, x_4 \ge 0.$
Max $z = \frac{3}{4}x_1 - 20x_2 + \frac{1}{2}x_3 - 6x_4$
subject to
$\frac{1}{2}x_1 - 12x_2 - \frac{1}{2}x_3 + 3x_4 + x_5 = 0$
$\frac{1}{4}x_1 - 8x_2 - x_3 + 9x_4 + x_6 = 0$
$x_3 + x_7 = 1$
$x_1, x_2, x_3, x_4, x_5, x_6, x_7 \ge 0.$





If we solve the above problem by the simplex method then the results of various 


tableaus can be summarized as follows 








































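Iteration 1: $x_5 = 0, x_6 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 2: $x_5 = 0, x_1 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 3: $x_2 = 0, x_1 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 4: $x_2 = 0, x_3 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 5: $x_4 = 0, x_3 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 6: $x_4 = 0, x_6 = 0, x_7 = 1$; corner point (0,0,0,0)
Iteration 7: $x_5 = 0, x_6 = 0, x_7 = 1$; corner point (0,0,0,0)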


Here the b.f.s obtained at the seventh iteration is the same as obtained at the initial
iteration (i.e. iteration number 1) and so we have a sequence of six degenerate b.f.s
which are going to cycle and they all correspond to the same corner point (0,0,0,0) in
$\mathbb{R}^4$ where the objective function value is zero.
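
The cycle can be reproduced with a few lines of exact arithmetic. The sketch below rests on several assumptions that are not stated in the text: the objective $\frac{3}{4}x_1 - 20x_2 + \frac{1}{2}x_3 - 6x_4$ is treated as a maximization, the entering variable is the one with the most negative $z_j - c_j$, ties in the minimum ratio are broken in favour of the row listed first, and the two degenerate rows are listed with the $\frac{1}{4}x_1$ constraint (slack $x_6$) first. All function names are hypothetical.

```python
from fractions import Fraction as F

def pivot(T, basis, r, c):
    """Gauss-Jordan pivot on T[r][c]; the last row of T holds z_j - c_j."""
    p = T[r][c]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][c] != 0:
            f = T[i][c]
            T[i] = [a - f * b for a, b in zip(T[i], T[r])]
    basis[r] = c

def dantzig_step(T, basis):
    """One iteration for a Max problem; returns False if the tableau is optimal."""
    z = T[-1][:-1]
    c = min(range(len(z)), key=lambda j: z[j])      # most negative z_j - c_j
    if z[c] >= 0:
        return False
    ratios = [(T[i][-1] / T[i][c], i) for i in range(len(T) - 1) if T[i][c] > 0]
    if not ratios:
        raise ValueError("unbounded")
    _, r = min(ratios)                              # ties go to the lowest row index
    pivot(T, basis, r, c)
    return True

def initial_tableau():
    # Columns: x1..x7, then the right-hand side. Basis: x6, x5, x7 (0-based 5, 4, 6).
    rows = [
        [F(1, 4), -8, -1, 9, 0, 1, 0, 0],
        [F(1, 2), -12, F(-1, 2), 3, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 1, 1],
        [F(-3, 4), 20, F(-1, 2), 6, 0, 0, 0, 0],     # z_j - c_j row
    ]
    return [[F(v) for v in row] for row in rows], [5, 4, 6]

def run(n_iter):
    T, basis = initial_tableau()
    history = [tuple(basis)]
    for _ in range(n_iter):
        dantzig_step(T, basis)
        history.append(tuple(basis))
    return T, history

T, history = run(6)
print(history)                       # the basis after six pivots equals the first one
print(T == initial_tableau()[0])     # and so does the whole tableau
```

Under these assumptions the objective value stays at zero throughout and the seventh tableau coincides with the first, so the same six pivots would repeat indefinitely.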


2.10 The Simplex Tableau in the Condensed Form


Let us recall the format of the simplex tableau used in earlier sections and call it the
tableau in the extended form. In these tableaus we store $y^{(j)}$ columns for both basic as
well as nonbasic variables. But it is already known that if $x_j$ is a basic variable then
the $y^{(j)}$ column is an identity column and the value of $(z_j - c_j) = 0$. Therefore storing $y^{(j)}$
columns and the values of $(z_j - c_j)$ for basic variables $x_j$ is rather unnecessary and can
possibly be avoided. Keeping this in mind, in this section we introduce a new format
for the simplex tableau, which we shall call the tableau in the condensed form.

The main advantage of using the tableau in condensed form is that we have to store
less data and work with a smaller sized tableau. The only change in the earlier working
will be with regard to pivoting. The pivoting rules used earlier for the tableau in extended
form need modification if we are working with the tableau in the condensed form.
We take the following example to illustrate the format of the tableau in the condensed
form.
Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 + x_4 = 10$
$x_1, x_2, x_3, x_4 \ge 0.$


If we are working with the usual tableau (i.e. tableau in the extended form) then the
initial tableau will look like


XB y) y” y) 





If we now decide to work with the tableau in the condensed form then the initial tableau
will look like


xg lly y2 





which we shall normally write as 
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Step 3 Divide the remaining entries of the pivot column by the negative of the pivot 
element. 


Step 4 For remaining entries of the tableau, follow the same update rule as the one for 
tableau in the extended form. 


Step 5 Exchange the indices of the variable to enter and variable to leave. 

Here we may note that except for Step 5, nothing is done (in Steps 2,3 or 4) for 
indices in the right most column and top most row. As such these are not entries in the 
tableau, they just identify the basic and nonbasic variables at every iteration. 


For our example, after performing the above pivoting steps on the initial tableau in 
the condensed form we get the following 





In the tableau $x_3$ is a basic variable whose value is 3 and $x_1$ is a basic variable whose
value is 5. Also $x_4$ and $x_2$ are now nonbasic variables. As $(z_2 - c_2)$ is still negative, the current
solution is not optimal. As indicated, now $x_2$ becomes a basic variable and $x_3$ becomes
a nonbasic variable, and therefore the modified pivoting rules as described here give the
next tableau as





Therefore $x_1^* = 2$, $x_2^* = 6$ is an optimal solution and the optimal value is $z^* = 4(2) + 3(6) = 26$.
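
The condensed-form pivoting can be sketched in a few lines. Steps 1 and 2 are not reproduced in this part of the text, so the sketch assumes the usual rule for them: the pivot element is replaced by its reciprocal and the remaining entries of the pivot row are divided by the pivot. The layout (first column holds the values of the basic variables, last row holds $z$ and the $z_j - c_j$ entries) and the name `condensed_pivot` are likewise assumptions made for illustration.

```python
from fractions import Fraction as F

def condensed_pivot(T, row_labels, col_labels, r, c):
    """Return the condensed tableau after pivoting on T[r][c]; swaps the labels."""
    p = T[r][c]
    new = [row[:] for row in T]
    for i in range(len(T)):
        for j in range(len(T[0])):
            if i == r and j == c:
                new[i][j] = 1 / p                    # assumed Step 1
            elif i == r:
                new[i][j] = T[i][j] / p              # assumed Step 2
            elif j == c:
                new[i][j] = -T[i][j] / p             # Step 3
            else:
                new[i][j] = T[i][j] - T[i][c] * T[r][j] / p   # Step 4
    row_labels[r], col_labels[c] = col_labels[c], row_labels[r]  # Step 5
    return new

def solve_example():
    # Max z = 4x1 + 3x2, x1 + x2 + x3 = 8, 2x1 + x2 + x4 = 10.
    T = [[F(8), F(1), F(1)],
         [F(10), F(2), F(1)],
         [F(0), F(-4), F(-3)]]           # z row: value of z, then z_j - c_j
    rows, cols = ["x3", "x4", "z"], ["value", "x1", "x2"]
    T = condensed_pivot(T, rows, cols, r=1, c=1)     # x1 enters, x4 leaves
    T = condensed_pivot(T, rows, cols, r=0, c=2)     # x2 enters, x3 leaves
    return T, rows, cols

T, rows, cols = solve_example()
print(rows, cols)
print([[str(v) for v in row] for row in T])
```

After the first pivot the value column reads 3 (for $x_3$), 5 (for $x_1$) and 20 (for $z$); after the second pivot both $z_j - c_j$ entries are positive and the value column gives $x_2 = 6$, $x_1 = 2$ and $z = 26$.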


2.11 Summary and Additional Notes 


• Section 2.1 presents the graphical method which provides certain intuitive results for
the general LPP's.

• Section 2.3 describes the simplex method as an algebraic version of the graphical
method. The concept of b.f.s is introduced as an algebraic analogue of the geometric
• The simplex algorithm was developed by G.B. Dantzig in 1947 and published at a
later date in 1949.

• There are some excellent texts on linear programming, e.g., Hadley [72], Murty [117],
Bazaraa et al. [12], and Gass [66].

• The most classic book on linear programming is by Dantzig himself, i.e. Dantzig [44],
which is an experience to read.

• Linear programming is also covered in most of the text books on operations research.
In particular Taha [154] and Phillips et al. [122] are excellent references.

• The book by Charnes and Cooper [35] gives an excellent account of various
applications of linear programming in management and industry.

2.12 Exercises 


2.1 Solve the following LPP’s graphically 


(1) Maz z= 2x; + 4x2 
subject to 
3x1 + 5x2 < 15 
3X4 + 2x2 <i? 
X1,X2 = 0. 


(2) Min z = x, — 10x» 
subject to 
x1 — 5x2 >0 
-Xi 5% <5 
¥1,X2 = 0. 





(3) Max Z = 2x1 + 5x2 
subject to 
i xı + 2x2 < 20 
. xı +X <15 
ESD 










(5) Maz cae 
subject to 

2x1 +X2 54 

-H + Xo 21 

XY, X2 > 0. 


(6) Max ae 
subject to 

EOS L 

Sit oe ee 

Xiyae = 0. 


(7) Max ee 
subject to 

Xj —-% <1 

2x1 — X2 <6 

Xoo, 0. 


(8) Maz rae 
subject to 
PES 3) 
x, <4 
3.65 (ey 2 0. 


(9) Mar S i 
subject to 

Me = 2 

xy + x2 21 

eS; 

X1,X2 m2 0. 


(10) Max A 
subject to 





ale [xa] < 2 


a = 0. 
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2.3 Consider the problem 
Maz 2 = Min(3x — 10, -5x + 5) 
subject to 
OSASI 


(a) Solve the above problem graphically. 
(b) Formulate the above as a LPP in standard form. 


2.4 Suppose we want to show that all solutions of 


$x_1 + x_2 \le 4$
$2x_1 - 3x_2 \le 6$
$x_1 \ge 0, x_2 \ge 0$


also satisfy $x_1 + 2x_2 \le 8$. Formulate this problem as a LPP and verify your result
graphically.


2.5 Use the simplex method to solve all problems given in 2.1 and identify those which
have


1. unique optimal solution 

2. unbounded solution 

3. infinitely many optimal solutions and 
4. no feasible solution. 





Verify your answers, both by the simplex method as well as by the graphical method. 
2.6 Identify all basic feasible solutions for the system 


X1 +4x2 +xX3 =8 
EELA F V = 4 
X1, X2, X3, X4 > 0. 
2.7 Solve the following LPP without using the simplex method 


Max 


Z = 4x; + 5x2 + 11x3 +2x4 
subject to 


21x1 +7x — 3x3 + 10x4 = 210 
Xi z 0,i = 1,2,3,4. 


4 
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2.9 Use the simplex method to check the consistency of the following system 
—6xX, + X2 + X3 55 
—2x1 + 2x2 — 3x3 2 3 
2x2 — 4x3 = 1 





X1,X2,X3 = 0. 
2.10 Solve the following LPP 

Min z = 2x1 — x3 + 28 
subject to 

—X, + X2 + X3 = 4 

—xX, + X2 — X3 Š 6 

x1 <0,x2 20 
x3 unrestricted in sign. 


2.11 Use the simplex method to determine a solution of the following set of linear 


equations 
Ka X = 4 
2x1 +X2= Oe 


2.12 Find all degenerate b.f.s of the system 


x1 +X. +%x3=3 
36) O ar ee 0 
i XT, X2, X3, X4 > 0. 


2.13 Solve the following LPP by the two phase method and illustrate each iteration 


graphically 
Max z = —X1 + 8X2 
subject to 
Xt XD = il 
=| ar 6x2 < 3 
Xi S2 
Ki A 0. 






lution of following LPP without actually solving it 


m: 


= ye She an LS, Le 
X1 — X2 + X3 — X4 
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2.15 Is the following a LPP ? 


Max z= 4x; + 3X2 
subject to (xy +%258 or 2x1 +28 10) 
iy eee 3 


Solve the given problem graphically. 
2.16 Solve the LPP 


Max ZH Xz + 2X2 Psi + NXy 


subject to 
Xp +X t+... +X <1 


n Sl 
X > Oi =e Ti: 








3 
Mathematics of the Simplex Method 







3.1 Introduction 


While describing the simplex method in the last chapter, we have used many results 
which were guessed purely from the geometry of the linear programming problem. The 


basic aim of this chapter is to prove all these results mathematically so as to complete 
the discussion of the simplex method from theoretical point of view as well. 


3.2 Some Basic Definitions 


In this section we introduce some basic definitions on convex sets and related concepts, 
which are to be used in the subsequent sections. 


Definition 3.2.1 (Convex Set). Let $S \subseteq \mathbb{R}^n$. The set $S$ is called a convex set if
$x, u \in S$ and $0 \le \lambda \le 1$ $\Rightarrow$ $\lambda x + (1-\lambda)u \in S$.


Thus a set $S \subseteq \mathbb{R}^n$ is convex if for any two points $x, u$ in $S$, the whole line segment
joining $x$ and $u$ is in the set $S$. In Fig 3.1, the first and third sets are convex in $\mathbb{R}^2$ but
the second set (the shaded portion) is not convex.





Fig. 3.1. 






By convention, an empty set and a single point set are always considered to be
convex. Also the intersection of arbitrarily many convex sets is always a convex set but
Definition 3.2.2 (Convex Hull). Let $S \subseteq \mathbb{R}^n$. Then the smallest convex set containing
the given set $S$ is called the convex hull of $S$ and is denoted by Conv($S$).

It is obvious that if $S$ is a convex set then Conv($S$) = $S$. In Fig 3.2, a nonconvex set
and its convex hull are depicted in $\mathbb{R}^2$.





wae 


S Conv(S) 


Fig. 3.2. 


Definition 3.2.3 (Extreme Point/Corner Point). Let $S \subseteq \mathbb{R}^n$ be a convex set. A
point $x^*$ of $S$ is called an extreme point or a corner point of $S$ if there do not exist
$x, u$ ($x \ne u$) in $S$ and $0 < \lambda < 1$ such that $x^* = \lambda x + (1-\lambda)u$.


Thus a point $x^*$ is an extreme point of $S$ if it does not lie strictly inside the line segment
joining any two distinct points of $S$. We may check that for the set
$S_1 = \{(x_1, x_2) : x_1 \ge 0, x_2 \ge 0, x_1 + x_2 \le 1\}$, the extreme points are (0,0), (1,0), and (0,1);
whereas for the set $S_2 = \{(x_1, x_2) : x_1^2 + x_2^2 \le 1\}$, every point on the circle
$x_1^2 + x_2^2 = 1$ is an extreme point.


Definition 3.2.4 (Hyperplane). Let $p \in \mathbb{R}^n$ and $d \in \mathbb{R}$. Then the set $H$ defined as
$H = \{x \in \mathbb{R}^n : p^T x = d\}$ is called a hyperplane.


Thus a hyperplane $H$ in $\mathbb{R}^n$ is the natural extension of a line in $\mathbb{R}^2$ or a plane in $\mathbb{R}^3$.
Also every hyperplane is a convex set.


Definition 3.2.5 (Closed Half Spaces). Let $H = \{x \in \mathbb{R}^n : p^T x = d\}$ be a hyperplane
in $\mathbb{R}^n$. Then the sets

$H_1 = \{x \in \mathbb{R}^n : p^T x \le d\}$
and
$H_2 = \{x \in \mathbb{R}^n : p^T x \ge d\}$

are called the closed half spaces generated by the hyperplane $H$.


We can check that $H_1$ and $H_2$ are convex sets in $\mathbb{R}^n$.
Definition 3.2.7 (Edge). Let $S \subseteq \mathbb{R}^n$ be a convex set, and $x, u \in S$ with $x \ne u$. Then
the line segment joining $x, u$ is called an edge of the convex set $S$ if it is the intersection
of $S$ with a supporting hyperplane.


Definition 3.2.8 (Adjacent Extreme Points). Let $S \subseteq \mathbb{R}^n$ be a convex set. Then
two extreme points $x$ and $u$ of $S$ are called adjacent extreme points if they are joined by
an edge.


Definition 3.2.9 (Convex Combination). Let $x, u \in \mathbb{R}^n$. Then the combination
$\lambda x + (1-\lambda)u$, $0 \le \lambda \le 1$ is called the convex combination of $x$ and $u$. In general,
let $x^{(1)}, x^{(2)}, \ldots, x^{(r)}$ be $r$ points in $\mathbb{R}^n$. Then the combination $\sum_{k=1}^{r} \lambda_k x^{(k)}$ with
$\lambda_k \ge 0$ $(k = 1, 2, \ldots, r)$ and $\sum_{k=1}^{r} \lambda_k = 1$ is called the convex combination of $r$ points
$x^{(1)}, x^{(2)}, \ldots, x^{(r)}$.


Here we must note the difference between the linear combination and convex
combination. In the linear combination $a_1 x^{(1)} + a_2 x^{(2)} + \ldots + a_r x^{(r)}$, $a_1, a_2, \ldots, a_r \in \mathbb{R}$. But in the
convex combination the coefficients are non-negative and their sum equals one.
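
The distinction can be made concrete with a small sketch. The function name `convex_combination` and the sample points (the corner points of the triangle with vertices (0,0), (1,0), (0,1)) are chosen here only for illustration; exact fractions are used so that the test on the sum of the weights is not affected by rounding.

```python
from fractions import Fraction as F

def convex_combination(points, weights):
    """Return sum_k w_k * x^(k); reject weights that are not a convex combination."""
    w = [F(x) for x in weights]
    if any(x < 0 for x in w) or sum(w) != 1:
        raise ValueError("weights must be non-negative and sum to one")
    dim = len(points[0])
    return tuple(sum(wk * F(p[i]) for wk, p in zip(w, points)) for i in range(dim))

corners = [(0, 0), (1, 0), (0, 1)]
print(convex_combination(corners, [F(1, 2), F(1, 5), F(3, 10)]))

try:
    convex_combination(corners, [2, -1, 0])   # a linear but not a convex combination
except ValueError as err:
    print("rejected:", err)
```

The first call gives the point (1/5, 3/10), which lies in the triangle; the weights (2, -1, 0) define a perfectly valid linear combination but are rejected as a convex one.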


Definition 3.2.10 (Convex set spanned by a set). Let $S \subseteq \mathbb{R}^n$. Then the set
CSpan($S$) given by

CSpan($S$) $= \{\sum_{r=1}^{k} \lambda_r x^{(r)} : \sum_{r=1}^{k} \lambda_r = 1$, $k$ finite (arbitrary), $\lambda_r \ge 0$, and $x^{(r)} \in S\ \forall r\}$

is called the convex set spanned by $S$.


Thus CSpan($S$) is the set of all convex combinations of arbitrary but finitely many
elements of $S$. It is simple to check that CSpan($S$) and Conv($S$) are the same. Also CSpan($S$)
is different from Span($S$) as Span($S$) is the set of all linear combinations of finitely many
elements of $S$, and therefore it is a subspace of $\mathbb{R}^n$ whereas CSpan($S$) is a convex set in
$\mathbb{R}^n$.


Definition 3.2.11 (Polyhedron/Polytope). Let $S \subseteq \mathbb{R}^n$. Then $S$ is called a polyhedron
if it is the intersection of a finite number of closed half spaces, i.e.


$S = \{x \in \mathbb{R}^n : p_i^T x \le d_i \ (i = 1, 2, \ldots, m)\}$.


If a polyhedron is also bounded then it is called a polytope. A polytope is thus a closed,
bounded, convex set having finitely many extreme points; the bounding surface being
the hyperplanes. The set $\mathbb{R}^2_+$ given by $\mathbb{R}^2_+ = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge 0, x_2 \ge 0\}$ is a polyhedron
in $\mathbb{R}^2$ but not a polytope, whereas the set $S = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 + x_2 \le 1, x_1 \ge 0, x_2 \ge 0\}$
is a polytope.
A 2-simplex in $\mathbb{R}^2$ is a solid triangle and a 3-simplex in $\mathbb{R}^3$ is a solid tetrahedron (the
points inside the triangle/tetrahedron are also being included).

It may be noted that every point of the polytope can be expressed as a convex
combination of its extreme points. Therefore if $S$ is a polytope with extreme points
$x^{(1)}, x^{(2)}, \ldots, x^{(k)}$ then for any $x \in S$, there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$ such that
$\alpha_r \ge 0$ $(r = 1, \ldots, k)$, $\sum_{r=1}^{k} \alpha_r = 1$, and $x = \sum_{r=1}^{k} \alpha_r x^{(r)}$.


3.3 Some Elementary Results for LPP 


We consider the LPP 


Max $z = c_1 x_1 + c_2 x_2 + \ldots + c_n x_n$
subject to
$a_{11} x_1 + a_{12} x_2 + \ldots + a_{1n} x_n \ (\le, =, \ge)\ b_1$
$a_{21} x_1 + a_{22} x_2 + \ldots + a_{2n} x_n \ (\le, =, \ge)\ b_2$
$\vdots$
$a_{m1} x_1 + a_{m2} x_2 + \ldots + a_{mn} x_n \ (\le, =, \ge)\ b_m$
$x_1, x_2, \ldots, x_n \ge 0$, (3.1)

where as explained in the last chapter, only one inequality sign holds in each constraint,
though different constraints may have different inequality signs. We denote by $S$ the
feasible region of LPP (3.1).


Result 3.3.1 The feasible region $S$ is a convex subset of $\mathbb{R}^n$.


Proof. Let $S_i$ denote the set of all points $x \in \mathbb{R}^n$ for which the $i$-th constraint of LPP (3.1)
holds $(i = 1, 2, \ldots, m, m+1, \ldots, m+n)$. Then each $S_i$ is either a hyperplane or one of the
closed half spaces, and hence a convex set. But $S = \cap_i S_i$ is the intersection of finitely
many convex sets, consequently it is a convex set. □


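Result 3.3.2 Let LPP (3.1) have an optimal solution $x^*$. Then $x^*$ cannot lie in the
interior of the feasible region $S$.

Proof. If possible, let $x^* \in \text{int}\, S$. As $x^*$ is optimal to problem (3.1), we have
$c^T x^* \ge c^T x$ for all $x \in S$. Then by the definition of interior point there exists a
neighborhood $N_\epsilon(x^*)$ of $x^*$ such that $N_\epsilon(x^*) \subseteq S$. Let

$\hat{x} = x^* + \frac{\epsilon}{2} \frac{c}{\|c\|}$. (3.2)

Then $\hat{x} \in N_\epsilon(x^*) \subseteq S$ and

$c^T \hat{x} = c^T \left(x^* + \frac{\epsilon}{2} \frac{c}{\|c\|}\right)$
$= c^T x^* + \frac{\epsilon}{2} \frac{\|c\|^2}{\|c\|}$
$= c^T x^* + \frac{\epsilon}{2} \|c\|$. (3.3)


Therefore (3.3) gives $(c^T \hat{x} - c^T x^*) = \frac{\epsilon}{2}\|c\| > 0$, which contradicts that $x^*$ is optimal
to (3.1). Hence $x^* \notin \text{int}\, S$. □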


Result 3.3.3 If the given LPP has an optimal solution then at least one corner point 
of S is optimal. 


Proof. Although the proof holds in much more generality, we shall give the proof for 
the case when S is a polytope. In this case, it is guaranteed that the given LPP has 
an optimal solution, so that we have to only show that the optimal value is certainly 
attained at an extreme point. 

Let $x^*$ be an optimal point of the given LPP. If $x^*$ is an extreme point then the result
holds obviously. So we consider the case when $x^*$ is not an extreme point. Then, since
$S$ is a polytope it has finitely many corner points, say $x^{(1)}, x^{(2)}, \ldots, x^{(k)}$, and any point of
$S$ can be expressed as a convex combination of the corner points of $S$. In particular this
holds for $x^*$ as well, i.e. there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$ such that

$x^* = \sum_{r=1}^{k} \alpha_r x^{(r)}$, $\sum_{r=1}^{k} \alpha_r = 1$, $\alpha_r \ge 0$ $(r = 1, \ldots, k)$. (3.4)

Therefore,

$c^T x^* = c^T \left(\sum_{r=1}^{k} \alpha_r x^{(r)}\right) = \sum_{r=1}^{k} \alpha_r (c^T x^{(r)})$,

i.e. $c^T x^*$ is the weighted arithmetic mean (with weights $\alpha_r$) of $k$ scalars $c^T x^{(r)}$. Hence by
the property of the arithmetic mean,

$c^T x^* = \sum_{r=1}^{k} \alpha_r (c^T x^{(r)}) \le \max_{1 \le r \le k} c^T x^{(r)} = c^T x^{(p)}$ (say). (3.5)

But $x^*$ is optimal and $x^{(p)} \in S$, so that

$c^T x^* \ge c^T x^{(p)}$. (3.6)

Equations (3.5) and (3.6) give $c^T x^* = c^T x^{(p)}$, which proves the result. This is because if
$x^*$ is not an extreme point then there exists an extreme point $x^{(p)}$ which is also optimal.
Thus at least one extreme point is certainly optimal. □


Result 3.3.4 Every local optimal point of LPP (3.1) is also a global optimal point.


Proof. Let $\bar{x}$ be a local max point of LPP (3.1). Then by the definition of local max
point, there exists a neighborhood $N_\delta(\bar{x})$ such that $c^T \bar{x} \ge c^T x$ for all $x \in N_\delta(\bar{x}) \cap S$,
where $S$ is the feasible region of the given LPP. (We have taken $N_\delta(\bar{x}) \cap S$ because
the $\delta$-neighborhood of $\bar{x}$ is to be considered in the relative topology of $S$.) Let $u$ be
any arbitrary point of $S$ outside $N_\delta(\bar{x}) \cap S$. The result will be proved if we can show that
$c^T \bar{x} \ge c^T u$.

Now we refer to the below given figure (Fig 3.3) and note that there certainly exists
a point $\hat{x} \in N_\delta(\bar{x}) \cap S$ and $0 < \lambda < 1$ such that $\hat{x} = \lambda u + (1-\lambda)\bar{x}$.


Fig. 3.3.


Therefore

$c^T \hat{x} = c^T (\lambda u + (1-\lambda)\bar{x})$
$= \lambda (c^T u) + (1-\lambda)(c^T \bar{x})$. (3.7)


But as $\bar{x}$ is a local max point and $\hat{x} \in N_\delta(\bar{x}) \cap S$, we have $c^T \bar{x} \ge c^T \hat{x}$. Therefore equation
(3.7) gives

$\lambda (c^T u) + (1-\lambda)(c^T \bar{x}) = c^T \hat{x} \le c^T \bar{x}$,

i.e. $\lambda (c^T u) \le (c^T \bar{x}) - (1-\lambda)(c^T \bar{x}) = \lambda (c^T \bar{x})$,

which because of $0 < \lambda < 1$, gives $c^T u \le c^T \bar{x}$. Hence $\bar{x}$ is a global max point. □
Result 3.3.5 The set of all optimal solutions of a LPP is a convex set.


Proof. If the given LPP has only one optimal solution then the result holds trivially.
So let the given LPP have at least two optimal solutions, and let $V$ be the set of all
optimal solutions of the given LPP, i.e.

$V = \{x \in S : x \text{ is optimal}\}$,

where $S$ is the feasible region of the given LPP. We shall prove that $V$ is a convex set.
Let $x^{(1)}$ and $x^{(2)} \in V$. Let $\hat{x} = \lambda x^{(1)} + (1-\lambda) x^{(2)}$, $0 \le \lambda \le 1$. Then $\hat{x} \in S$ because
$x^{(1)} \in S$, $x^{(2)} \in S$ and $S$ is a convex set. Also $c^T x^{(1)} \ge c^T x$ for all $x \in S$ and $c^T x^{(2)} \ge c^T x$ for
all $x \in S$, so that $c^T \hat{x} = \lambda c^T x^{(1)} + (1-\lambda) c^T x^{(2)} \ge c^T x$ for all $x \in S$. Hence $\hat{x} \in V$, which
gives that $V$ is a convex set. As a consequence of Result 3.3.5 we get that if a LPP has
more than one optimal solution, then it has infinitely many optimal solutions. □


Remark 3.3.1 Though the above results have been proved for the LPP (3.1) which is in 
the maximization form, these results hold for LPP’s in the minimization form as well. 


In view of the above discussions, we observe that because of the structure of linearity
in LPP's, the below given properties are guaranteed. Further, only because of these
properties, we succeeded in developing a method like the simplex method to solve LPP's.
These properties are


(P1) The feasible region of a LPP is always a convex set. In fact it is a polyhedron/polytope.
(P2) If the given LPP has an optimal solution then at least one corner point (extreme
point) of the feasible region is optimal.


(P3) For LPP's, every local max point is a global max point. Also every local min point
is a global min point.
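
Property (P2), together with the correspondence between corner points and basic feasible solutions, suggests a brute-force check for very small problems: enumerate all basic feasible solutions and compare the objective values. The following is only a sketch under stated assumptions. It uses the standard form of Example 2.7.1 (Max $z = x_1 + x_2$, $x_1 + x_2 + x_3 = 8$, $2x_1 + x_2 + x_4 = 10$), and the names `solve` and `basic_feasible_solutions` are hypothetical. Enumeration grows combinatorially and is not a substitute for the simplex method.

```python
from fractions import Fraction as F
from itertools import combinations

def solve(M, rhs):
    """Solve M x = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def basic_feasible_solutions(A, b):
    """All b.f.s. of Ax = b, x >= 0, one per feasible choice of m basic columns."""
    m, n = len(A), len(A[0])
    out = []
    for basis in combinations(range(n), m):
        xB = solve([[A[i][j] for j in basis] for i in range(m)], b)
        if xB is None or any(v < 0 for v in xB):
            continue                      # singular basis or infeasible solution
        x = [F(0)] * n
        for j, v in zip(basis, xB):
            x[j] = v
        out.append(tuple(x))
    return out

A, b, c = [[1, 1, 1, 0], [2, 1, 0, 1]], [8, 10], [1, 1, 0, 0]
bfs = basic_feasible_solutions(A, b)
values = [sum(ci * xi for ci, xi in zip(c, x)) for x in bfs]
best = max(values)
print([tuple(map(str, x)) for x, v in zip(bfs, values) if v == best], best)
```

For this problem four basic feasible solutions are found, and the maximum value 8 is attained at two of them, $(2, 6, 0, 0)$ and $(0, 8, 0, 2)$, in agreement with the alternative optima found in Example 2.7.1.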


Thus properties (P1), (P2) and (P3) are very basic to the algorithmic study of
LPP's and the main reason for having them is the presence of the structure of linearity.
For nonlinear programming problems (NLP's) the structure of linearity is missing and
therefore, in general, we cannot guarantee these properties and that makes NLP's much
more difficult to solve. The following examples illustrate these points.


Example 3.3.1 Consider the optimization problem 


Max $z = x_1 + 3x_2$
subject to
$x_1 + x_2 \le 3$
$x_1 x_2 \le 1$
$x_1, x_2 \ge 0$


and check if its feasible region is a convex set.


Solution The given optimization problem is a NLP as the constraint $x_1 x_2 - 1 \le 0$ is a
nonlinear constraint.
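
That the feasible region of Example 3.3.1 is not convex can be confirmed with a single pair of points. The two points below are chosen here for illustration and are not taken from the text; exact fractions avoid rounding on the boundary $x_1 x_2 = 1$.

```python
from fractions import Fraction as F

def feasible(x1, x2):
    """Feasibility for x1 + x2 <= 3, x1 * x2 <= 1, x1, x2 >= 0."""
    return x1 >= 0 and x2 >= 0 and x1 + x2 <= 3 and x1 * x2 <= 1

p = (F(5, 2), F(2, 5))          # feasible: sum 29/10, product 1
q = (F(2, 5), F(5, 2))          # feasible by symmetry
mid = tuple((a + b) / 2 for a, b in zip(p, q))   # the convex combination with weight 1/2

print(feasible(*p), feasible(*q), feasible(*mid))
```

Both end points are feasible, but their midpoint $(29/20, 29/20)$ has product $841/400 > 1$ and is infeasible, so property (P1) fails for this NLP.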
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Fig. 3.4. 


Example 3.3.2 Consider the optimization problem 


Min $z = (x_1 - 3)^2 + (x_2 - 3)^2$
subject to
$x_1 + x_2 \le 4$
$x_1 - x_2 \le 2$
$x_1, x_2 \ge 0.$


Is the optimal point a corner point? 





Solution The given optimization problem is a NLP as the objective function is a
nonlinear function of the decision variables $x_1$ and $x_2$. Here the feasible region is a polytope
with corner points O, A, B, and C; and the optimal solution of the given problem is the
point P, which is not a corner point as depicted in Fig. 3.5.

Therefore there is no guarantee that the property (P2) holds for NLP's. In fact if in
the above example we change the objective function to $z = (x_1 - 2)^2 + (x_2 - 1)^2$, then the
optimal point (2,1) lies in the interior of the feasible region $S$.
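
Both claims of Example 3.3.2 can be checked by a crude exact grid search. This is only an illustrative sketch: the grid spacing 1/20, the corner coordinates (0,0), (2,0), (3,1), (0,4) computed from the constraints, and the function names are assumptions made here, and a grid search is not one of the methods discussed in the text.

```python
from fractions import Fraction as F

def feasible(x1, x2):
    return x1 >= 0 and x2 >= 0 and x1 + x2 <= 4 and x1 - x2 <= 2

def f(x1, x2):                     # objective of Example 3.3.2
    return (x1 - 3) ** 2 + (x2 - 3) ** 2

def g(x1, x2):                     # modified objective
    return (x1 - 2) ** 2 + (x2 - 1) ** 2

def grid_minimizer(obj, den=20):
    """Best feasible grid point (value, point) on the grid of spacing 1/den in [0,4]^2."""
    best = None
    for i in range(4 * den + 1):
        for j in range(4 * den + 1):
            x = (F(i, den), F(j, den))
            if feasible(*x):
                v = obj(*x)
                if best is None or v < best[0]:
                    best = (v, x)
    return best

corners = [(0, 0), (2, 0), (3, 1), (0, 4)]
print(min(f(*c) for c in corners))   # best value over the corner points
print(grid_minimizer(f))             # best grid point for f
print(grid_minimizer(g))             # best grid point for g
```

The best corner value of the first objective is 4 (at (3,1)), whereas the grid search finds the smaller value 2 at the non-corner point (2,2) on the edge $x_1 + x_2 = 4$. For the modified objective the minimizer (2,1) satisfies every constraint strictly, so it is an interior point.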


Example 3.3.3 Consider the optimization problem of minimizing the function $f(x) =
(|x| - 10)\cos(2\pi x)$ over $[-10, 10]$. Is a local min point also a global min point?







E S] ra: Oy. a In mabe e ene X FA 
Solution | he function ea = 
‘cm RE a Yh Nle A 





—— ( y= 0 f "( S | PZ T° Ny Ji ; 2 . . 
ver [10.10] is bien i, Dae eae SY? JS called one dimensional wave function: 


function has many local min points but 
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Fig. 3.5. 


Fig. 3.6.


There are many other difficult global optimization benchmark problems available in
the literature and it is always a challenge to obtain the global min point because most of
the nonlinear programming algorithms give only the local min point. We shall
discuss some global optimization algorithms in the later chapters. Though we
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e exploited for studying such NLP’s. W 


ity is there and that can b l 
Se cid icn ming problems later 1m the book. 







shall have a detailed discussion of convex progran point of 
u ESVE 
the parti 
3.4 The Simplex Algorithm: Main Theorems


Let us consider the LPP in the standard form, i.e.

Max $z = c^T x$
subject to
$Ax = b$
$x \ge 0$, (3.8)

with (i) $b \ge 0$ and (ii) Rank $A = m\ (< n)$. Here $x \in \mathbb{R}^n$, $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$ and $A = [a_{ij}]$
is an $(m \times n)$ matrix. Also $S = \{x \in \mathbb{R}^n : Ax = b, x \ge 0\}$ is the feasible region of LPP
(3.8).

We now recall our discussion of the simplex algorithm in the last chapter and note
that the most basic concept introduced there is the concept of the basic feasible solution
$x_B$ for the given basis matrix $B$. We can write $A = [B : R]$ and accordingly partition
the vector $x$ as $x = \text{col}(x_B, x_R)$. Then the system $Ax = b$ can be written as

$[B : R] \begin{pmatrix} x_B \\ x_R \end{pmatrix} = b$,
i.e. $B x_B + R x_R = b$,
i.e. $x_B = B^{-1} b - B^{-1} R x_R$. (3.9)

Equation (3.9) is the well known result which states that if Rank $A = m\ (< n)$ then the
system $Ax = b$ will have infinitely many solutions depending upon $(n - m)$ parameters.
In (3.9) if we choose $x_R = 0$ then (3.9) gives $x_B = B^{-1} b$ and this solution, namely,
($x_B = B^{-1} b$, $x_R = 0$), we have called the basic solution for the basis matrix $B$. Further
if $x_B \ge 0$ then we call the solution the basic feasible solution for the basis matrix
$B$. We also recall other notations introduced in the last chapter, namely, $y^{(j)}$ $(j =
1, \ldots, n)$, $z(x_B)$ and $z_j - c_j$ $(j = 1, \ldots, n)$.

We now have the following main theorems, which have been used in the last chapter
for the development of the simplex method.

Theorem 3.4.1 Every extreme point of the set $S$ is a b.f.s to the system of equations
$Ax = b$, $x \ge 0$ and conversely every b.f.s of the above system is an extreme point of the
set $S$.

Proof. (i) Let $x = \text{col}(x_B, 0)$ be a b.f.s of the system $Ax = b$, $x \ge 0$. We wish to prove
that $x$ is an extreme point of $S$. If possible, let $x$ be not an extreme
point of $S$. Then, by the definition of an extreme point, this implies that there exist
$u \in S$, $v \in S$, $u \ne v$, $0 < \lambda < 1$ such that $x = \lambda u + (1-\lambda)v$. Now partitioning $u$ and $v$ as per
the partition of $x$, we have $u = \text{col}(u_B, u_R)$ and $v = \text{col}(v_B, v_R)$. Then $x = \lambda u + (1-\lambda)v$
gives

$\begin{pmatrix} x_B \\ 0 \end{pmatrix} = \lambda \begin{pmatrix} u_B \\ u_R \end{pmatrix} + (1-\lambda) \begin{pmatrix} v_B \\ v_R \end{pmatrix}$,
i.e. $x_B = \lambda u_B + (1-\lambda) v_B$ (3.10)
$0 = \lambda u_R + (1-\lambda) v_R$. (3.11)


But $Au = b$, $u \ge 0$; $Av = b$, $v \ge 0$. Also $0 < \lambda < 1$, and hence (3.11) gives $u_R = 0 = v_R$.
This together with $Au = b$ and $Av = b$ gives $u_B = B^{-1} b$ and $v_B = B^{-1} b$. Thus
$u = v$, which contradicts that $u$ and $v$ are distinct. Therefore $x$ is an extreme point of
$S$.

(ii) Next let $x^* = \text{col}(x_1, x_2, \ldots, x_n)$ be an extreme point of $S$. We shall prove that $x^*$
is a b.f.s to the system $Ax = b$, $x \ge 0$. For this, it is enough to show that columns $a^{(j)}$
of $A$ corresponding to non-zero components of $x^*$ are linearly independent. Let $k$ be the
number of nonzero components of $x^*$. Then without any loss of generality we can assume
that $x^* = \text{col}(x_1, x_2, \ldots, x_k, 0, 0, \ldots, 0)$, and then show that columns $a^{(1)}, a^{(2)}, \ldots, a^{(k)}$ are
linearly independent. Here we should note that $k \le m$, as Rank $(A) = m$. For $k = m$,
the linear independence of $a^{(1)}, a^{(2)}, \ldots, a^{(k)}$ will give that $x^*$ is a nondegenerate b.f.s.
However for $k < m$, the columns $a^{(1)}, a^{(2)}, \ldots, a^{(k)}$ will not form a basis even if they are
linearly independent. But then we can always augment $(m - k)$ more columns such that
columns $a^{(1)}, a^{(2)}, \ldots, a^{(k)}$ together with these $(m - k)$ columns are linearly independent and
hence form a basis (we may recollect that in a finite dimensional vector space every set
of linearly independent vectors can be extended to form a basis). Now the components
of $x^*$ corresponding to these $(m - k)$ augmented columns are basic variables at the zero
level and therefore $x^*$ becomes a degenerate b.f.s.

Now we proceed to prove that columns $a^{(1)}, a^{(2)}, \ldots, a^{(k)}$ are linearly independent. If
possible, let these be linearly dependent, so that there exist scalars $\lambda_i$ (not all zero) such that


$\sum_{i=1}^{k} \lambda_i a^{(i)} = 0$. (3.12)
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: (2) = x* — eA. Then x > 
1 = collAy, An AmA 0,+.+,0)'E IR", x) = x* + eA, and ¥ = xX" —€ M 
A = COAN Ma Ak Me Ca 15 1 (3 13) Ax =f Ax) = b. Therefore x F 
and x® > 0. Also because of (3.12) and (3.13), A: | , S 
and x® e S. But x* = (x!) + x))/2, which contradicts that x™ is an ex re ' point of 
S. Hence columns a) a, ...,a” are linearly independent as desired. This prove that 
x* is a b.fis of the system Ax = b,x 2 0. D 
Let us now recall the simplex tableau

[Simplex tableau: not recoverable from the source.]

and associated results which have been used in the stepwise description of the simplex method. We shall prove these results now.


Theorem 3.4.2 If some $z_j - c_j < 0$ and for that $j$ some $y_{ij} > 0$, then there exists a new b.f.s. $\hat{x}_B$ such that $z(\hat{x}_B) \ge z(x_B)$.


Proof. Let the hypotheses of the theorem hold for the current b.f.s. $x_B$ corresponding to the basis matrix $B = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, b^{(r)}, b^{(r+1)}, \ldots, b^{(m)}]$. Let $B$ be changed to $\hat{B}$, where

$$\hat{B} = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, a^{(j)}, b^{(r+1)}, \ldots, b^{(m)}],$$

i.e. from $B$, column $b^{(r)}$ has been taken out and at that place column $a^{(j)}$ has been entered. So far we have not put any conditions on $j$ and $r$, but obviously there must be some conditions on $j$ and $r$ such that $\hat{B}$ gives a b.f.s. $\hat{x}_B$ with $z(\hat{x}_B) \ge z(x_B)$. Then to prove the theorem, we have to show that the conditions on $j$ and $r$ as chosen here shall be met under the hypotheses of the theorem.

Now the first thing we require is that the columns of $\hat{B}$ are linearly independent because only then $\hat{x}_B$ will be a basic solution. Next we wish that every component $\hat{x}_{B_i}$ of $\hat{x}_B$ is non-negative so that it is a b.f.s.; and finally this $\hat{x}_B$ should be such that $z(\hat{x}_B) \ge z(x_B)$. We now consider all of these one by one.

Since $a^{(j)}$ is a non-basic column, it can be expressed as a linear combination of the basis columns, i.e.

$$a^{(j)} = y_{1j} b^{(1)} + \cdots + y_{rj} b^{(r)} + \cdots + y_{mj} b^{(m)}. \qquad (3.14)$$

Then by the replacement theorem (let $V$ be a vector space of dimension $n$ with the basis $v_1, v_2, \ldots, v_n$; for $v \in V$ we have $v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$; the Replacement Theorem states that if $\alpha_r \ne 0$ then the set obtained by replacing $v_r$ with $v$ is also a basis of $V$) the condition on $r$ and $j$ is that $y_{rj} \ne 0$, and this gives that the columns of $\hat{B}$ are linearly independent and hence it is a basis matrix which gives the basic solution $\hat{x}_B$. Now we attempt to find $\hat{x}_B$ explicitly. For this we note that $x_B = B^{-1}b$, i.e. $Bx_B = b$, or in terms of components





$$\sum_{i=1}^{m} x_{B_i} b^{(i)} = b, \qquad (3.15)$$

i.e.

$$\sum_{i=1,\, i \ne r}^{m} x_{B_i} b^{(i)} + x_{B_r} b^{(r)} = b.$$

But as $y_{rj} \ne 0$, (3.14) gives

$$b^{(r)} = \frac{1}{y_{rj}} a^{(j)} - \sum_{i=1,\, i \ne r}^{m} \frac{y_{ij}}{y_{rj}} b^{(i)}, \qquad (3.16)$$

which on substitution in (3.15) gives

$$\sum_{i=1,\, i \ne r}^{m} x_{B_i} b^{(i)} + \frac{x_{B_r}}{y_{rj}} \left[ a^{(j)} - \sum_{i=1,\, i \ne r}^{m} y_{ij} b^{(i)} \right] = b, \qquad (3.17)$$

i.e.

$$\sum_{i=1,\, i \ne r}^{m} \left( x_{B_i} - x_{B_r} \frac{y_{ij}}{y_{rj}} \right) b^{(i)} + \frac{x_{B_r}}{y_{rj}} a^{(j)} = b. \qquad (3.18)$$

Therefore if we define $\hat{x}_B$ as the vector whose components are

$$\hat{x}_{B_i} = x_{B_i} - x_{B_r} \frac{y_{ij}}{y_{rj}} \quad (i \ne r), \qquad \hat{x}_{B_r} = \frac{x_{B_r}}{y_{rj}}, \qquad (3.19)$$

then (3.18) reads $\hat{B}\hat{x}_B = b$, i.e. $\hat{x}_B$ is the basic solution corresponding to $\hat{B}$.

Having obtained the basic solution $\hat{x}_B$ (for that we needed the condition on $r$ and $j$ that $y_{rj} \ne 0$) for the new basis matrix $\hat{B}$, we have now to put additional conditions so that $\hat{x}_B$ becomes basic feasible. For this we need that all components of $\hat{x}_B$ be non-negative. Now looking at equation (3.19), we note that for $\hat{x}_{B_r}$ to be non-negative we need $y_{rj} > 0$. For $i \ne r$, the component $\hat{x}_{B_i}$ is certainly non-negative for those $i$ for which $y_{ij} \le 0$. So we have to bother only when $y_{ij} > 0$. Thus we want that for $y_{ij} > 0$ as well

$$x_{B_i} - x_{B_r} \frac{y_{ij}}{y_{rj}} \ge 0,$$

i.e.

$$\frac{x_{B_i}}{y_{ij}} \ge \frac{x_{B_r}}{y_{rj}},$$

i.e.

$$\frac{x_{B_r}}{y_{rj}} = \min_i \left\{ \frac{x_{B_i}}{y_{ij}} : y_{ij} > 0 \right\}. \qquad (3.20)$$

The relation (3.20) is called the minimum ratio criterion and it guarantees that irrespective of $j$, as long as some $y_{ij} > 0$, if we choose $r$ as per (3.20) then the next solution $\hat{x}_B$ is certainly basic feasible.
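The minimum ratio rule and the update (3.19) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `min_ratio_row` and `new_basic_solution` are hypothetical, indices are 0-based, and the sample data are the values $x_B = (6, 3)$ and $y^{(j)} = (3, 6)$ that occur later in Example 3.6.1.

```python
from fractions import Fraction

def min_ratio_row(x_B, y_j):
    """Return the row index r picked by the minimum ratio rule (3.20), or None.

    x_B : current basic variable values (assumed non-negative)
    y_j : entering column y^(j) expressed in the current basis
    None means that no y_ij > 0 exists, so no variable can leave.
    """
    best_r, best_ratio = None, None
    for i, (x, y) in enumerate(zip(x_B, y_j)):
        if y > 0:  # only rows with y_ij > 0 take part in the test
            ratio = Fraction(x) / Fraction(y)
            if best_ratio is None or ratio < best_ratio:
                best_r, best_ratio = i, ratio
    return best_r

def new_basic_solution(x_B, y_j, r):
    """Components of the new basic solution after the pivot, as in (3.19)."""
    theta = Fraction(x_B[r]) / Fraction(y_j[r])
    return [theta if i == r else Fraction(x_B[i]) - theta * y_j[i]
            for i in range(len(x_B))]

x_B = [6, 3]
y_j = [3, 6]
r = min_ratio_row(x_B, y_j)              # second row gives the smaller ratio 3/6
x_hat = new_basic_solution(x_B, y_j, r)  # stays non-negative by construction
```

Exact fractions are used here so that the ratios can be compared without rounding; a floating-point version would need a tolerance in the test `y > 0`.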

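Next we wish to put a condition on $j$ (note that the condition on $r$ has already been fixed now as per equation (3.20)) so that $z(\hat{x}_B) \ge z(x_B)$. For this we note that $z(x_B) = c_B^T x_B = \sum_{i=1}^{m} c_{B_i} x_{B_i}$ and

$$z(\hat{x}_B) = \sum_{i=1}^{m} \hat{c}_{B_i} \hat{x}_{B_i} = \sum_{i=1,\, i \ne r}^{m} c_{B_i} \left( x_{B_i} - x_{B_r} \frac{y_{ij}}{y_{rj}} \right) + c_j \frac{x_{B_r}}{y_{rj}} \qquad (3.21)$$

$$= \sum_{i=1}^{m} c_{B_i} \left( x_{B_i} - x_{B_r} \frac{y_{ij}}{y_{rj}} \right) + c_j \frac{x_{B_r}}{y_{rj}} = z(x_B) - \frac{x_{B_r}}{y_{rj}} (z_j - c_j). \qquad (3.22)$$

Here, in (3.21), $\hat{c}_{B_r} = c_j$ as the entering column is $a^{(j)}$ and for that the corresponding basic variable is $x_j$ whose coefficient in the objective function is $c_j$. For $i \ne r$, the basic columns have not changed and so $\hat{c}_{B_i} = c_{B_i}$. Also, including $i = r$ in the summation amounts to adding a 'zero' value in (3.21) and hence there is no change.

Now if we choose our $j$ such that $(z_j - c_j) < 0$ then from (3.22), $z(\hat{x}_B) \ge z(x_B)$. Thus we must have some $j$ for which $(z_j - c_j) < 0$ and for that some $y_{ij} > 0$. Then we can choose $r$ as from (3.20) and get a new b.f.s. $\hat{x}_B$ with $z(\hat{x}_B) \ge z(x_B)$. This proves the theorem. □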


Theorem 3.4.3 If all $(z_j - c_j) \ge 0$ then the current b.f.s. $x_B$ is optimal to the given LPP.


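Proof. Let $x \in S$ be an arbitrary point. We have to show that under the condition $(z_j - c_j) \ge 0$, $\forall j$, $z(x_B) \ge z(x)$, where $z(x_B) = c_B^T x_B$ and $z(x) = c^T x$. For this we start with $z_j - c_j \ge 0$, $\forall j$, and as $x_j \ge 0$, $\forall j$, we have

$$z_j x_j \ge c_j x_j, \quad \forall j.$$

Now summing over $j$, we get

$$\sum_{j=1}^{n} z_j x_j \ge \sum_{j=1}^{n} c_j x_j,$$

i.e.

$$\sum_{j=1}^{n} \left( \sum_{i=1}^{m} c_{B_i} y_{ij} \right) x_j \ge c^T x.$$

But on the LHS these are finite sums and so we can interchange the order and get

$$\sum_{i=1}^{m} c_{B_i} \left( \sum_{j=1}^{n} y_{ij} x_j \right) \ge c^T x. \qquad (3.23)$$

From (3.23) we observe that the theorem will be proved if we can show that

$$x_{B_i} = \sum_{j=1}^{n} y_{ij} x_j, \qquad (3.24)$$

because then the LHS of (3.23) will be $c_B^T x_B$ which equals $z(x_B)$. Now $x \in S$ gives $Ax = b$, i.e. $\sum_{j=1}^{n} x_j a^{(j)} = b$. As $a^{(j)} = By^{(j)}$, we get $B \sum_{j=1}^{n} x_j y^{(j)} = b$ and hence

$$\sum_{j=1}^{n} x_j y^{(j)} = B^{-1} b = x_B. \qquad (3.25)$$

But (3.25) is a vector equality and therefore we have to equate componentwise. The $i$th component on the LHS is $y_{i1} x_1 + y_{i2} x_2 + \cdots + y_{in} x_n$. Therefore $x_{B_i} = \sum_{j=1}^{n} y_{ij} x_j$ as desired. □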


Theorem 3.4.4 If some $(z_j - c_j) < 0$ and for that $j$ all $y_{ij} \le 0$, then the given LPP has unbounded solution.


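Proof. Let us choose that $j$ for which the hypotheses of the theorem hold, i.e. $(z_j - c_j) < 0$ and $y_{ij} \le 0$ for all $i = 1, \ldots, m$. Now for the current b.f.s. $x_B$,

$$\sum_{i=1}^{m} x_{B_i} b^{(i)} = b. \qquad (3.26)$$

If we choose $\theta \in \mathbb{R}$ arbitrary then (3.26) can be rewritten as

$$\sum_{i=1}^{m} x_{B_i} b^{(i)} + \theta a^{(j)} - \theta a^{(j)} = b. \qquad (3.27)$$

But $a^{(j)}$ is a non-basic column and therefore

$$a^{(j)} = \sum_{i=1}^{m} y_{ij} b^{(i)}. \qquad (3.28)$$

Substituting for $a^{(j)}$ from (3.28) into (3.27) we get

$$\sum_{i=1}^{m} x_{B_i} b^{(i)} + \theta a^{(j)} - \theta \sum_{i=1}^{m} y_{ij} b^{(i)} = b,$$

i.e.

$$\sum_{i=1}^{m} (x_{B_i} - \theta y_{ij}) b^{(i)} + \theta a^{(j)} = b. \qquad (3.29)$$

If we now define a vector $\hat{x}$ as $\hat{x} = \mathrm{col}(\hat{x}_1, \ldots, \hat{x}_m, \hat{x}_{m+1}, 0, 0, \ldots, 0)$ where

$$\hat{x}_i = x_{B_i} - \theta y_{ij} \quad (i = 1, \ldots, m) \quad \text{and} \quad \hat{x}_{m+1} = \theta, \qquad (3.30)$$

then (3.29) gives

$$A\hat{x} = b.$$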


If we further choose $\theta > 0$, then $\hat{x} \ge 0$ as $y_{ij} \le 0$, $\forall i$. Therefore using the condition $y_{ij} \le 0$, $\forall i$, and knowing the current b.f.s. $x_B$, we have been able to construct a feasible solution $\hat{x}$. Here we may note that $\hat{x}$ is feasible but NOT basic feasible, as it can have as many as $(m + 1)$ nonzero components, namely those associated with the columns $b^{(1)}, b^{(2)}, \ldots, b^{(m)}$ and $a^{(j)}$.

So far we have not used the condition $(z_j - c_j) < 0$. For this, let us find the value of the objective function for $\hat{x}$, i.e.

$$z(\hat{x}) = \sum_{i=1}^{m} c_{B_i} \hat{x}_i + \theta c_j = \sum_{i=1}^{m} c_{B_i} (x_{B_i} - \theta y_{ij}) + \theta c_j = \sum_{i=1}^{m} c_{B_i} x_{B_i} - \theta \left( \sum_{i=1}^{m} c_{B_i} y_{ij} - c_j \right),$$

i.e.

$$z(\hat{x}) = z(x_B) - \theta (z_j - c_j). \qquad (3.31)$$

Since $(z_j - c_j) < 0$ and $\theta > 0$ is arbitrary, equation (3.31) tells us that the objective function value $z(\hat{x})$ can be made arbitrarily large by choosing $\theta > 0$ arbitrarily large. This proves that the given LPP has unbounded solution. □


One important point to be noted in the above proof is that it not only shows that the given LPP has unbounded solution but also constructs the feasible point $\hat{x}$ for which an arbitrarily large chosen value $z(\hat{x})$ of the objective function will be attained. The following example is illustrative in this context.
Example 3.4.1 Use the simplex method to verify that the following LPP has unbounded solution

Max $z = 2x_1 + 3x_2$
subject to
$x_1 - x_2 \le 2$
$-3x_1 + x_2 \le 4$
$x_1, x_2 \ge 0$.

Also find a feasible solution for which the objective function takes the value 496.

Solution Introducing the slack variables $x_3$ and $x_4$ and solving the problem as usual by the simplex method, we arrive at the following tableau

$$\begin{array}{c|c|rrrr}
 & x_B & y^{(1)} & y^{(2)} & y^{(3)} & y^{(4)} \\ \hline
x_3 & 6 & -2 & 0 & 1 & 1 \\
x_2 & 4 & -3 & 1 & 0 & 1 \\ \hline
z_j - c_j & z = 12 & -11 & 0 & 0 & 3
\end{array}$$

Now Theorem 3.4.4 is applicable, which confirms that the given LPP has unbounded solution. Next we have to find a feasible solution $\hat{x}$ for which the objective function attains the value 496. In terms of our notations $z(\hat{x}) = 496$, $z(x_B) = 12$ and $(z_1 - c_1) = -11$. Therefore using the relation (3.31) we get

$$496 = 12 - \theta(-11),$$

i.e.

$$\theta = 484/11 = 44.$$

Now we use (3.30) to construct $\hat{x}$. For this we note that here $x_3$ and $x_2$ are basic variables and $j = 1$. Hence

$$\hat{x}_1 = 44, \quad \hat{x}_2 = 4 - 44(-3) = 136, \quad \hat{x}_3 = 6 - 44(-2) = 94, \quad \hat{x}_4 = 0$$

is the desired feasible solution.
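The construction in (3.30) and (3.31) is easy to check numerically. The sketch below is only a verification aid, not part of the method: the helper name `ray_point` is hypothetical, and the objective $2x_1 + 3x_2$ and the two equality constraints are taken from the example as reconstructed above (an assumption where the scan is unclear).

```python
from fractions import Fraction

def ray_point(x_B, y_j, theta):
    """Basic components x_Bi - theta*y_ij and the entering value theta, as in (3.30)."""
    basic = [Fraction(x) - theta * y for x, y in zip(x_B, y_j)]
    return basic, Fraction(theta)

# Data read off the final tableau: basis (x3, x2), entering index j = 1.
x_B, y_1 = [6, 4], [-2, -3]
z_B, zj_minus_cj = 12, -11
target = 496

theta = Fraction(target - z_B, -zj_minus_cj)   # solve (3.31) for theta
(x3, x2), x1 = ray_point(x_B, y_1, theta)
x4 = Fraction(0)                               # x4 stays non-basic at zero

# Feasibility in equality form and the objective value (assumed objective 2*x1 + 3*x2).
first_ok = (x1 - x2 + x3 == 2)
second_ok = (-3 * x1 + x2 + x4 == 4)
z = 2 * x1 + 3 * x2
```

Any other target value above 12 can be handled the same way, since $\theta$ only has to be positive.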


Corollary 3.4.1. If all $(z_j - c_j) \ge 0$ and there exists a $j$ such that $y_{ij} \le 0$ for all $i = 1, \ldots, m$, then the given LPP has unbounded feasible region but bounded optimal solution.

Proof. As $(z_j - c_j) \ge 0$ for all $j$, by Theorem 3.4.3, the current solution $x_B$ is optimal, so the given LPP has bounded optimal solution. That the feasible region is unbounded follows from the construction of $\hat{x}$ in the proof of Theorem 3.4.4. □

Theorem 3.4.5 If in the optimal simplex tableau, for some non-basic variable $x_j$, $z_j - c_j = 0$ and for that $j$ some $y_{ij} > 0$, then the given LPP has infinitely many optimal solutions.

Proof. We already have one optimal solution, namely the current b.f.s. $x_B$. It is enough to find one more optimal solution and then use the fact that every convex combination of two optimal solutions of an LPP is again optimal. Let the hypotheses of the theorem hold. As some $y_{ij} > 0$, we can let $x_j$ enter the basis and choose the variable to leave as per the minimum ratio criterion. This gives a new basic feasible solution $\hat{x}_B$ which will also be optimal because

$$z(\hat{x}_B) = z(x_B) - \frac{x_{B_r}}{y_{rj}} (z_j - c_j)$$

and $(z_j - c_j) = 0$. This proves the theorem. □


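Theorem 3.4.6 Let $B = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, b^{(r)}, b^{(r+1)}, \ldots, b^{(m)}]$ be the current basis matrix. Let the column to enter, $a^{(j)}$, and the column to leave, $b^{(r)}$, be determined as per the criteria of Theorem 3.4.2. Then for the basis matrix $\hat{B} = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, a^{(j)}, b^{(r+1)}, \ldots, b^{(m)}]$, the new b.f.s. $\hat{x}_B$, the new objective function value $z(\hat{x}_B)$, the new vector $\hat{y}^{(k)}$ and the new values of $(\hat{z}_k - c_k)$ are given by

(i) $\hat{x}_{B_i} = x_{B_i} - x_{B_r} \dfrac{y_{ij}}{y_{rj}}$ $(i \ne r)$, and $\hat{x}_{B_r} = \dfrac{x_{B_r}}{y_{rj}}$ $(i = r)$

(ii) $z(\hat{x}_B) = z(x_B) - \dfrac{x_{B_r}}{y_{rj}} (z_j - c_j)$

(iii) $\hat{y}_{ik} = y_{ik} - y_{rk} \dfrac{y_{ij}}{y_{rj}}$ $(i \ne r)$, and $\hat{y}_{rk} = \dfrac{y_{rk}}{y_{rj}}$ $(i = r)$

(iv) $(\hat{z}_k - c_k) = (z_k - c_k) - \dfrac{y_{rk}}{y_{rj}} (z_j - c_j)$.

Proof. Here we must note that proving the above four relations is equivalent to saying that pivoting is justified. We have already proved (as part of the proof of Theorem 3.4.2) the first two relations and therefore we shall prove relations (iii) and (iv) only.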

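Let us go back to the proof of Theorem 3.4.2 and observe that

$$b^{(r)} = \frac{1}{y_{rj}} a^{(j)} - \sum_{i=1,\, i \ne r}^{m} \frac{y_{ij}}{y_{rj}} b^{(i)}.$$

Now

$$a^{(k)} = By^{(k)} = \sum_{i=1,\, i \ne r}^{m} y_{ik} b^{(i)} + y_{rk} b^{(r)}$$

$$= \sum_{i=1,\, i \ne r}^{m} y_{ik} b^{(i)} + y_{rk} \left[ \frac{1}{y_{rj}} a^{(j)} - \sum_{i=1,\, i \ne r}^{m} \frac{y_{ij}}{y_{rj}} b^{(i)} \right]$$

$$= \sum_{i=1,\, i \ne r}^{m} \left( y_{ik} - y_{rk} \frac{y_{ij}}{y_{rj}} \right) b^{(i)} + \frac{y_{rk}}{y_{rj}} a^{(j)}$$

$$= \sum_{i=1,\, i \ne r}^{m} \hat{y}_{ik} b^{(i)} + \hat{y}_{rk} a^{(j)} = \hat{B} \hat{y}^{(k)},$$

which proves relation (iii).

Next, let us consider the expression for $(\hat{z}_k - c_k)$. By definition

$$(\hat{z}_k - c_k) = \sum_{i=1}^{m} \hat{c}_{B_i} \hat{y}_{ik} - c_k$$

$$= \sum_{i=1,\, i \ne r}^{m} c_{B_i} \left( y_{ik} - y_{rk} \frac{y_{ij}}{y_{rj}} \right) + c_j \frac{y_{rk}}{y_{rj}} - c_k$$

$$= \sum_{i=1}^{m} c_{B_i} \left( y_{ik} - y_{rk} \frac{y_{ij}}{y_{rj}} \right) + c_j \frac{y_{rk}}{y_{rj}} - c_k$$

$$= \left( \sum_{i=1}^{m} c_{B_i} y_{ik} - c_k \right) - \frac{y_{rk}}{y_{rj}} \left( \sum_{i=1}^{m} c_{B_i} y_{ij} - c_j \right),$$

i.e. $(\hat{z}_k - c_k) = (z_k - c_k) - \dfrac{y_{rk}}{y_{rj}} (z_j - c_j)$, which proves result (iv). □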
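Relations (i) to (iv) together describe one complete pivot, and a small sketch can make this concrete. The function `pivot` below is a hypothetical helper (0-based indices, exact fractions). The sample data are an assumed starting tableau for Example 3.4.1 with the slack basis $(x_3, x_4)$, in which $x_2$ enters and the second row leaves.

```python
from fractions import Fraction

def pivot(x_B, Y, zc, z, r, j):
    """One simplex pivot written directly from relations (i)-(iv).

    x_B : list of m basic values          Y  : m x n list holding y_ik
    zc  : list of n values z_k - c_k      z  : current objective value
    r, j: leaving row and entering column (0-based), with Y[r][j] != 0
    """
    m, n = len(Y), len(zc)
    p = Fraction(Y[r][j])                                   # pivot element y_rj
    new_x = [x_B[r] / p if i == r else x_B[i] - x_B[r] * Y[i][j] / p
             for i in range(m)]                             # relation (i)
    new_z = z - x_B[r] / p * zc[j]                          # relation (ii)
    new_Y = [[Y[r][k] / p if i == r else Y[i][k] - Y[r][k] * Y[i][j] / p
              for k in range(n)] for i in range(m)]         # relation (iii)
    new_zc = [zc[k] - Y[r][k] / p * zc[j] for k in range(n)]  # relation (iv)
    return new_x, new_Y, new_zc, new_z

# Assumed initial tableau of Example 3.4.1: rows x3, x4; columns y^(1)..y^(4).
x_B = [2, 4]
Y = [[1, -1, 1, 0],
     [-3, 1, 0, 1]]
zc = [-2, -3, 0, 0]
new_x, new_Y, new_zc, new_z = pivot(x_B, Y, zc, 0, r=1, j=1)
```

With these inputs the result agrees with the tableau quoted in Example 3.4.1: basic values $(6, 4)$, objective value 12 and $(z_j - c_j)$ row $(-11, 0, 0, 3)$.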





3.5 A Useful Observation 


Theorems 3.4.2 and 3.4.6 could have also been proved in an alternate way. This is because the current basis matrix $B = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, b^{(r)}, b^{(r+1)}, \ldots, b^{(m)}]$ and the next basis matrix $\hat{B} = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, a^{(j)}, b^{(r+1)}, \ldots, b^{(m)}]$ differ only by one column. Since in the current tableau the inverse of the current basis matrix, namely $B^{-1}$, is known, we could possibly use this to get $(\hat{B})^{-1}$. If this updating of the inverse of the basis matrix could be done in a simple and efficient manner then we could possibly implement the simplex method in a different manner where, rather than having the $y^{(j)}$ columns in the tableau, we have $B^{-1}$. This will not only reduce the size of the tableau to be handled but also allow larger problems to be solved because the data ($c$, $b$ and $A$) could be stored separately. Such an implementation of the simplex method is available in the literature and is known as the revised simplex method.

We shall discuss the revised simplex method in the next section where a relationship between $B^{-1}$ and $(\hat{B})^{-1}$ will be utilized.


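Relationship between $B^{-1}$ and $(\hat{B})^{-1}$

Let, as before, $B = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, b^{(r)}, b^{(r+1)}, \ldots, b^{(m)}]$, $a^{(j)} = By^{(j)}$ and $\hat{B} = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, a^{(j)}, b^{(r+1)}, \ldots, b^{(m)}]$. Also let $F = (e_1, e_2, \ldots, e_{r-1}, y^{(j)}, e_{r+1}, \ldots, e_m)$ be an $(m \times m)$ matrix, where $e_1 = \mathrm{col}(1, 0, 0, \ldots, 0)$, $e_2 = \mathrm{col}(0, 1, 0, \ldots, 0)$, ..., $e_m = \mathrm{col}(0, 0, \ldots, 1)$ and $y^{(j)} = \mathrm{col}(y_{1j}, y_{2j}, \ldots, y_{mj})$. Then

$$BF = [Be_1, Be_2, \ldots, Be_{r-1}, By^{(j)}, Be_{r+1}, \ldots, Be_m] = [b^{(1)}, b^{(2)}, \ldots, b^{(r-1)}, a^{(j)}, b^{(r+1)}, \ldots, b^{(m)}] = \hat{B}.$$

Hence

$$(\hat{B})^{-1} = (BF)^{-1} = F^{-1} B^{-1} = EB^{-1}. \qquad (3.32)$$

But the matrix $E (= F^{-1})$ can be computed very easily provided $y_{rj} \ne 0$. It can be verified that

$$E = [e_1, e_2, \ldots, e_{r-1}, \xi, e_{r+1}, \ldots, e_m],$$

where

$$\xi = \mathrm{col}\left( -\frac{y_{1j}}{y_{rj}}, -\frac{y_{2j}}{y_{rj}}, \ldots, -\frac{y_{r-1,j}}{y_{rj}}, \frac{1}{y_{rj}}, -\frac{y_{r+1,j}}{y_{rj}}, \ldots, -\frac{y_{mj}}{y_{rj}} \right).$$

So $E$ is 'almost' an identity matrix except that the $r$th column is the vector $\xi$. Once $(\hat{B})^{-1}$ is known, we can compute all entries in the tableau easily. For example

$$\hat{x}_B = (\hat{B})^{-1} b = EB^{-1}b = E(B^{-1}b) = Ex_B = [e_1, e_2, \ldots, e_{r-1}, \xi, e_{r+1}, \ldots, e_m]\, x_B,$$

whose components are $x_{B_i} - x_{B_r} \dfrac{y_{ij}}{y_{rj}}$ for $i \ne r$ and $\dfrac{x_{B_r}}{y_{rj}}$ for $i = r$, which agrees with the expression obtained in the proof of Theorem 3.4.2.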
Example 3.5.1 Noting that the matrices $B = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$ and $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ differ only by one column, obtain $B^{-1}$.

Solution We have to use the relation (3.32), which in the context of the example becomes $B^{-1} = EI$, where $E = (e_1, \xi)$, as $B$ and $I$ differ only by one column at the position of the second column. Next we have to find the vector $\xi$. For this we need to compute the vector $y^{(2)}$ for the given $a^{(2)} = \mathrm{col}(2, 4)$. This gives $y^{(2)} = I^{-1} a^{(2)} = \mathrm{col}(2, 4)$.

As $B$ and $I$ differ at the position of the second column, in terms of our notation $r = 2$. Therefore $\xi = \mathrm{col}(-2/4, 1/4) = \mathrm{col}(-1/2, 1/4)$, which gives

$$B^{-1} = EI = \begin{pmatrix} 1 & -1/2 \\ 0 & 1/4 \end{pmatrix}.$$
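The update $(\hat{B})^{-1} = EB^{-1}$ never needs the matrix $E$ to be formed explicitly, because only row operations with the vector $\xi$ are involved. A minimal sketch follows; `eta_column` and `update_inverse` are hypothetical names, indices are 0-based, and the sample data are those of Example 3.5.1 as reconstructed above ($y^{(2)} = \mathrm{col}(2, 4)$, $r = 2$).

```python
from fractions import Fraction

def eta_column(y, r):
    """Column xi of E for pivot row r: -y_i/y_r off the pivot, 1/y_r on it."""
    p = Fraction(y[r])
    return [Fraction(1) / p if i == r else -Fraction(y[i]) / p
            for i in range(len(y))]

def update_inverse(B_inv, y, r):
    """Return E * B_inv, where E is the identity with column r replaced by xi."""
    xi = eta_column(y, r)
    m = len(B_inv)
    pivot_row = B_inv[r]
    # Row r is scaled by xi[r]; every other row i gets xi[i] times the pivot row added.
    return [[xi[i] * pivot_row[k] if i == r else B_inv[i][k] + xi[i] * pivot_row[k]
             for k in range(m)] for i in range(m)]

identity = [[1, 0], [0, 1]]
B_inv = update_inverse(identity, y=[2, 4], r=1)   # second column replaced

# Check: multiplying B = [[1, 2], [0, 4]] by the result should give the identity.
B = [[1, 2], [0, 4]]
product = [[sum(B[i][l] * B_inv[l][k] for l in range(2)) for k in range(2)]
           for i in range(2)]
```

The cost of one update is of order $m^2$ operations, which is the reason the formula is attractive compared with inverting $\hat{B}$ from scratch.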


3.6 The Revised Simplex Method 


In the last section, while deriving the updating formula $(\hat{B})^{-1} = EB^{-1}$, we noted that we could possibly implement the simplex method somewhat differently, where we store the elements of $B^{-1}$ rather than the columns $y^{(j)}$. This implementation is called the revised simplex method. So theoretically the simplex method and the revised simplex method are exactly the same; it is only the implementation that differs. The revised simplex tableau is of much smaller size because $B^{-1}$ is a matrix of order $(m \times m)$.

The main advantage of the revised simplex method is that here we handle a tableau of much smaller size. Also it is useful to recall that in the simplex method all the columns $y^{(j)}$ have to be evaluated first so that $(z_j - c_j)$ can be computed. But in the revised simplex method we first compute all $(z_j - c_j)$ and then evaluate only the column $y^{(j)}$ of the entering variable. As the updating formula $(\hat{B})^{-1} = EB^{-1}$ is simple and efficient, the revised simplex method implementation is certainly attractive.

The revised simplex method has another advantage over the simplex method. In most economic applications we need to know the shadow prices (dual variables), which (as we shall see in the sequel) are available in the revised simplex tableau itself, and no additional computational effort is needed to get them. Also certain efficient decomposition schemes for solving structured (but large) LPP's do employ the revised simplex method for their implementation.

Here we discuss the implementation of the revised simplex method for the case when artificial variables are not required and the $m$ slack variables themselves give the $(m \times m)$ identity basis matrix to start with. This case we term as standard form I. The case of standard form II, where some artificial variables are required, is illustrated by an example only.


Revised Simplex Method For Standard Form I 


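In this form, the $m$ slack variables themselves give the $(m \times m)$ identity basis matrix to start with and hence the given LPP has the following form

Max $z = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$
subject to
$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \le b_1$
$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \le b_2$ $\qquad$ (3.33)
$\vdots$
$a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \le b_m$
$x_1, \ldots, x_n \ge 0$,

where $b_i \ge 0$ $(i = 1, \ldots, m)$.

Now, in contrast to the simplex method, in the revised simplex method we make the objective function also a part of the constraints so that problem (3.33) can be written as

Max $z$
subject to
$z - (c_1 x_1 + c_2 x_2 + \cdots + c_n x_n) - 0x_{n+1} - 0x_{n+2} - \cdots - 0x_{n+m} = 0$
$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n + x_{n+1} = b_1$
$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n + x_{n+2} = b_2$ $\qquad$ (3.34)
$\vdots$
$a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n + x_{n+m} = b_m$
$x_1, \ldots, x_{n+m} \ge 0$.

Here $z$ represents the objective function and therefore can have any sign. Our aim is to find a solution of (3.34) where $z$ is as large as possible. Now the constraints of (3.34) constitute a system of $(m + 1)$ linear equations in $(n + m + 1)$ unknowns and this is expressed as

$$\begin{pmatrix} 1 & -c_1 & \cdots & -c_n & 0 & \cdots & 0 \\ 0 & a_{11} & \cdots & a_{1n} & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\ 0 & a_{m1} & \cdots & a_{mn} & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} z \\ x_1 \\ x_2 \\ \vdots \\ x_{n+m} \end{pmatrix} = \begin{pmatrix} 0 \\ b_1 \\ \vdots \\ b_m \end{pmatrix}, \qquad (3.35)$$

i.e.

$$\begin{pmatrix} 1 & -c^T \\ 0 & A \end{pmatrix} \begin{pmatrix} z \\ x \end{pmatrix} = \begin{pmatrix} 0 \\ b \end{pmatrix}, \qquad (3.36)$$

where $-c^T = (-c_1, -c_2, \ldots, -c_n, 0, \ldots, 0)$, and

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & 1 & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & 0 & 0 & \cdots & 1 \end{pmatrix}.$$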


Let us define $a_R^{(j)} = \mathrm{col}(-c_j, a_{1j}, \ldots, a_{mj}) = \begin{pmatrix} -c_j \\ a^{(j)} \end{pmatrix}$ $(j = 1, \ldots, n + m)$ and $b^{(R)} = \mathrm{col}(0, b_1, \ldots, b_m) = \begin{pmatrix} 0 \\ b \end{pmatrix}$, where the subscript/superscript $R$ refers to the revised simplex method.

In the system (3.35), $z$ is treated as a basic variable always because it represents the objective function. Therefore the first column of the basis matrix $B_R$ is always $e_1 = \mathrm{col}(1, 0, \ldots, 0)$. The remaining $m$ columns of $B_R$ (we note that $B_R$ is an $(m + 1) \times (m + 1)$ basis matrix for the system (3.35)) have to come from the columns $a_R^{(j)}$ depending upon which $m$ columns of $A$ currently constitute the basis matrix $B$ for the system $Ax = b$. Hence $B_R$ has the following form

$$B_R = \begin{pmatrix} 1 & -c_{B_1} & -c_{B_2} & \cdots & -c_{B_m} \\ 0 & & & & \\ \vdots & & B & & \\ 0 & & & & \end{pmatrix},$$

i.e.

$$B_R = \begin{pmatrix} 1 & -c_B^T \\ 0 & B \end{pmatrix}. \qquad (3.37)$$
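Now using the partition method of finding the inverse, we get

$$B_R^{-1} = \begin{pmatrix} 1 & c_B^T B^{-1} \\ 0 & B^{-1} \end{pmatrix}, \qquad (3.38)$$

where the $(m + 1)$ columns of $B_R^{-1}$ are denoted as $e_1, \beta_1, \beta_2, \ldots, \beta_m$, $e_1$ being the identity column, i.e. $e_1 = \mathrm{col}(1, 0, \ldots, 0)$. Thus $B_R^{-1} = [e_1, \beta_1, \beta_2, \ldots, \beta_m]$. Now in analogy with the simplex method we define

$$y_R^{(j)} = B_R^{-1} a_R^{(j)} = \begin{pmatrix} 1 & c_B^T B^{-1} \\ 0 & B^{-1} \end{pmatrix} \begin{pmatrix} -c_j \\ a^{(j)} \end{pmatrix} = \begin{pmatrix} c_B^T B^{-1} a^{(j)} - c_j \\ B^{-1} a^{(j)} \end{pmatrix} = \begin{pmatrix} z_j - c_j \\ y^{(j)} \end{pmatrix}. \qquad (3.39)$$

We observe that the first component of $y_R^{(j)}$ gives the value of $(z_j - c_j)$ and the remaining $m$ components give the usual $y^{(j)}$ column which appears in the simplex tableau. A point to be noted here is that $(z_j - c_j)$ is computed in the revised simplex method without first computing $y^{(j)}$, whereas in the simplex method we first compute $y^{(j)}$. In fact, from (3.39),

$$(z_j - c_j) = (\text{first row of } B_R^{-1}) \cdot (\text{column } a_R^{(j)}). \qquad (3.40)$$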
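Also

$$x_B^{(R)} = B_R^{-1} b^{(R)} = \begin{pmatrix} 1 & c_B^T B^{-1} \\ 0 & B^{-1} \end{pmatrix} \begin{pmatrix} 0 \\ b \end{pmatrix} = \begin{pmatrix} c_B^T B^{-1} b \\ B^{-1} b \end{pmatrix} = \begin{pmatrix} z(x_B) \\ x_B \end{pmatrix}, \qquad (3.41)$$

i.e. the first component of the vector $x_B^{(R)}$ gives the current value of the objective function and the remaining $m$ components of $x_B^{(R)}$ give the current b.f.s. $x_B$.

As for the standard form-I, initially $B = I$ and $c_B = 0$, so the inverse of the initial basis matrix for the revised simplex method is

$$B_R^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & I \end{pmatrix} = I_{m+1}.$$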




The data $a_R^{(j)}$ $(j = 1, \ldots, n)$ and $b^{(R)}$ is stored separately and access to a particular column is made only when it is required. We compute $(z_j - c_j)$ for the nonbasic columns by using the first row of $B_R^{-1}$ and select the variable $x_k$ to enter the basis. Then only $y_R^{(k)}$ is computed; no other $y^{(j)}$ is needed. Once $y_R^{(k)}$ is known, we find a variable to leave as per the minimum ratio criterion. Now we need to update the current basis inverse $B_R^{-1}$ to the new basis inverse, namely $(\hat{B}_R)^{-1}$. For this we use the relationship $(\hat{B}_R)^{-1} = EB_R^{-1}$ which, as derived in the last section, has

$$E = [e_1, e_2, \ldots, e_{r-1}, \xi, e_{r+1}, \ldots, e_m].$$










Once $(\hat{B}_R)^{-1}$ is known, the new revised simplex tableau is known and then we continue till all $(z_j - c_j) \ge 0$ or there is an indication of unbounded solution.
We now illustrate the working of the revised simplex method.





Example 3.6.1 Solve the following LPP by the revised simplex method

Max $z = 2x_1 + x_2$
subject to
$3x_1 + 4x_2 \le 6$
$6x_1 + x_2 \le 3$
$x_1, x_2 \ge 0$. $\qquad$ (3.42)

Solution Problem (3.42) is clearly in the standard form-I because the slack variables $x_3$ and $x_4$ will give a $(2 \times 2)$ identity matrix to start with. So we rewrite the problem as

Max $z$
subject to
$z - 2x_1 - x_2 - 0x_3 - 0x_4 = 0$
$3x_1 + 4x_2 + x_3 = 6$
$6x_1 + x_2 + x_4 = 3$
$x_1, x_2, x_3, x_4 \ge 0$. $\qquad$ (3.43)

Now as explained, the initial revised simplex tableau for problem (3.43) is


















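$$\begin{array}{c|ccc|c|c}
\text{Variable in the basis} & (z) & (x_3) & (x_4) & x_B^{(R)} & y_R^{(1)} \\ \hline
z & 1 & 0 & 0 & 0 & -2 \\
x_3 & 0 & 1 & 0 & 6 & 3 \\
x_4 & 0 & 0 & 1 & 3 & 6
\end{array}$$

First iteration

Step 1 Given the initial simplex tableau, we first compute $(z_j - c_j)$ for the nonbasic columns $a_R^{(j)}$. If all of these are nonnegative then the current solution is optimal, otherwise we choose the most negative $(z_k - c_k)$. For our example

$$(z_1 - c_1) = (\text{first row of } B_R^{-1})\, a_R^{(1)} = (1, 0, 0) \begin{pmatrix} -2 \\ 3 \\ 6 \end{pmatrix} = -2,$$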





88 Numerical Optimization with Applications 


$$(z_2 - c_2) = (\text{first row of } B_R^{-1})\, a_R^{(2)} = (1, 0, 0) \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix} = -1,$$

giving the most negative value as $-2$ for $k = 1$. Thus the variable $x_1$ becomes a basic variable.
Step 2 Once the most negative value $(z_k - c_k)$ is identified, we compute the vector $y_R^{(k)}$ only, by using the relation $y_R^{(k)} = B_R^{-1} a_R^{(k)}$. For our example $k = 1$ and so we compute $y_R^{(1)}$ only, to get

$$y_R^{(1)} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -2 \\ 3 \\ 6 \end{pmatrix} = \begin{pmatrix} -2 \\ 3 \\ 6 \end{pmatrix},$$

which we augment with the current tableau as shown.

Step 3 Next we find a variable to leave the basis by employing the usual minimum ratio criterion. In this context we note that $z$ will never be considered to leave the basis as it represents the objective function which we wish to maximize.

In our example we evaluate $\min(6/3, 3/6)$, which corresponds to the variable $x_4$, and therefore the variable $x_4$ leaves the basis.
Step 4 Now we update the current inverse $B_R^{-1}$ to get the new inverse $(\hat{B}_R)^{-1}$ which is given by $EB_R^{-1}$. In our example, as $x_1$ is becoming a basic variable and $x_4$ is becoming nonbasic, we have

$$E = [e_1, e_2, \xi],$$

where

$$\xi = \mathrm{col}(1/3, -1/2, 1/6).$$

Thus

$$E = \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix}$$

and

$$(\hat{B}_R)^{-1} = EB_R^{-1} = \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix},$$

$$x_B^{(R)} = (\hat{B}_R)^{-1} b^{(R)} = \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix} \begin{pmatrix} 0 \\ 6 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 \\ 9/2 \\ 1/2 \end{pmatrix}.$$

Therefore the next revised simplex tableau is

$$\begin{array}{c|ccc|c}
\text{Variable in the basis} & (z) & (x_3) & (x_1) & x_B^{(R)} \\ \hline
z & 1 & 0 & 1/3 & 1 \\
x_3 & 0 & 1 & -1/2 & 9/2 \\
x_1 & 0 & 0 & 1/6 & 1/2
\end{array}$$
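Second iteration

Step 1 We have

$$(z_4 - c_4) = (1, 0, 1/3) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = 1/3, \qquad (z_2 - c_2) = (1, 0, 1/3) \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix} = -2/3,$$

therefore $x_2$ becomes a basic variable.

Step 2 We find

$$y_R^{(2)} = \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix} \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix} = \begin{pmatrix} -2/3 \\ 7/2 \\ 1/6 \end{pmatrix}.$$

Step 3 As $\min\left( \dfrac{9/2}{7/2}, \dfrac{1/2}{1/6} \right) = \min(9/7, 3)$ occurs for the variable $x_3$, it becomes a nonbasic variable.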


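Step 4 Now $E = [e_1, \xi, e_3]$, where

$$\xi = \mathrm{col}\left( \frac{2/3}{7/2}, \frac{1}{7/2}, -\frac{1/6}{7/2} \right) = \mathrm{col}(4/21, 2/7, -1/21),$$

$$(\hat{B}_R)^{-1} = EB_R^{-1} = \begin{pmatrix} 1 & 4/21 & 0 \\ 0 & 2/7 & 0 \\ 0 & -1/21 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1/3 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/6 \end{pmatrix} = \begin{pmatrix} 1 & 4/21 & 5/21 \\ 0 & 2/7 & -1/7 \\ 0 & -1/21 & 4/21 \end{pmatrix},$$

$$x_B^{(R)} = (\hat{B}_R)^{-1} b^{(R)} = \begin{pmatrix} 1 & 4/21 & 5/21 \\ 0 & 2/7 & -1/7 \\ 0 & -1/21 & 4/21 \end{pmatrix} \begin{pmatrix} 0 \\ 6 \\ 3 \end{pmatrix} = \begin{pmatrix} 13/7 \\ 9/7 \\ 2/7 \end{pmatrix}.$$

Therefore the next revised simplex tableau is

$$\begin{array}{c|ccc|c}
\text{Variable in the basis} & (z) & (x_2) & (x_1) & x_B^{(R)} \\ \hline
z & 1 & 4/21 & 5/21 & 13/7 \\
x_2 & 0 & 2/7 & -1/7 & 9/7 \\
x_1 & 0 & -1/21 & 4/21 & 2/7
\end{array}$$

Third iteration

Step 1 We have

$$(z_3 - c_3) = (1, 4/21, 5/21) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 4/21 \quad \text{and} \quad (z_4 - c_4) = (1, 4/21, 5/21) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = 5/21.$$

As both of these are non-negative, the current solution is optimal. Thus the optimal value $z^*$ of the given LPP is $13/7$ and the optimal solution is $(x_1^* = 2/7, x_2^* = 9/7)$.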
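Steps 1 to 4 can be collected into a short program. The following is a minimal sketch for standard form I only, assuming exact rational data and no degeneracy safeguards (no anti-cycling rule is included); the function name `revised_simplex` and the variable names are hypothetical. It is run on the data of Example 3.6.1.

```python
from fractions import Fraction

def revised_simplex(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, with b >= 0 (standard form I).

    Only the (m+1) x (m+1) inverse B_R^{-1} and the column x_B^(R) are kept,
    following Steps 1-4 above.
    """
    m, n = len(A), len(c)
    # Columns a_R^(j) = col(-c_j, a_1j, ..., a_mj); each slack has cost 0 and a unit column.
    cols = [[Fraction(-c[j])] + [Fraction(A[i][j]) for i in range(m)] for j in range(n)]
    cols += [[Fraction(0)] + [Fraction(int(i == k)) for i in range(m)] for k in range(m)]
    Binv = [[Fraction(int(i == k)) for k in range(m + 1)] for i in range(m + 1)]
    xB = [Fraction(0)] + [Fraction(v) for v in b]
    basis = list(range(n, n + m))            # the slack variables start in the basis
    while True:
        # Step 1: z_j - c_j = (first row of B_R^{-1}) . a_R^(j) for nonbasic j
        red = {j: sum(Binv[0][i] * cols[j][i] for i in range(m + 1))
               for j in range(n + m) if j not in basis}
        k = min(red, key=red.get)
        if red[k] >= 0:                      # all z_j - c_j >= 0: optimal
            x = [Fraction(0)] * (n + m)
            for row, j in enumerate(basis):
                x[j] = xB[row + 1]
            return xB[0], x[:n]
        # Step 2: y_R^(k) = B_R^{-1} a_R^(k)
        y = [sum(Binv[i][l] * cols[k][l] for l in range(m + 1)) for i in range(m + 1)]
        # Step 3: minimum ratio over rows 1..m (row 0 is z and never leaves)
        rows = [i for i in range(1, m + 1) if y[i] > 0]
        if not rows:
            raise ValueError("unbounded solution")
        r = min(rows, key=lambda i: xB[i] / y[i])
        # Step 4: premultiply by E, the identity with column r replaced by xi
        xi = [1 / y[r] if i == r else -y[i] / y[r] for i in range(m + 1)]
        prow, px = Binv[r], xB[r]
        Binv = [[xi[i] * prow[l] + (0 if i == r else Binv[i][l]) for l in range(m + 1)]
                for i in range(m + 1)]
        xB = [xi[i] * px + (0 if i == r else xB[i]) for i in range(m + 1)]
        basis[r - 1] = k

z_opt, x_opt = revised_simplex([2, 1], [[3, 4], [6, 1]], [6, 3])
```

On this data the sketch performs the same two pivots as the worked example ($x_1$ replaces $x_4$, then $x_2$ replaces $x_3$). For floating-point data the sign tests would need tolerances.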


The Revised Simplex Method For The Standard Form-II 


We next consider the standard form-II, where the artificial variables are required. There could be many versions for the implementation of the revised simplex method for this case, but the easiest and most natural seems to be to use the Big-M method and solve the same by the revised simplex method. We illustrate this by the following example

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \ge 10$
$x_1, x_2 \ge 0$. $\qquad$ (3.44)

Let problem (3.44) be solved by the Big-M method, but rather than using the simplex method we wish to employ the revised simplex method. We have

Max $z = 4x_1 + 3x_2 + 0x_3 + 0x_4 - Mx_{a_1}$
subject to
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 - x_4 + x_{a_1} = 10$
$x_1, x_2, x_3, x_4, x_{a_1} \ge 0$, $\qquad$ (3.45)

which in the revised simplex format is expressed as

Max $z$
subject to
$z - 4x_1 - 3x_2 - 0x_3 - 0x_4 + Mx_{a_1} = 0$
$x_1 + x_2 + x_3 = 8$
$2x_1 + x_2 - x_4 + x_{a_1} = 10$
$x_1, x_2, x_3, x_4, x_{a_1} \ge 0$. $\qquad$ (3.46)

Here $c_B^T = (0, -M)$ and so $c_B^T B^{-1} = (0, -M) I = (0, -M)$.

Therefore the initial revised simplex tableau for problem (3.46) is

$$\begin{array}{c|ccc|c}
\text{Variable in the basis} & (z) & (x_3) & (x_{a_1}) & x_B^{(R)} \\ \hline
z & 1 & 0 & -M & -10M \\
x_3 & 0 & 1 & 0 & 8 \\
x_{a_1} & 0 & 0 & 1 & 10
\end{array}$$

because

$$B_R^{-1} = \begin{pmatrix} 1 & c_B^T B^{-1} \\ 0 & B^{-1} \end{pmatrix} = \begin{pmatrix} 1 & 0 & -M \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

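First iteration

Step 1 We have

$$(z_1 - c_1) = (1, 0, -M) \begin{pmatrix} -4 \\ 1 \\ 2 \end{pmatrix} = -4 - 2M,$$

$$(z_2 - c_2) = (1, 0, -M) \begin{pmatrix} -3 \\ 1 \\ 1 \end{pmatrix} = -3 - M,$$

$$(z_4 - c_4) = (1, 0, -M) \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix} = M.$$

So the most negative value of $(z_j - c_j)$ is $(z_1 - c_1) = -4 - 2M$, i.e. $k = 1$, and $x_1$ becomes a basic variable.

Step 2 Next we compute $y_R^{(1)}$, i.e.

$$y_R^{(1)} = \begin{pmatrix} 1 & 0 & -M \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -4 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -4 - 2M \\ 1 \\ 2 \end{pmatrix},$$

and append the same to the initial tableau.

Step 3 The minimum ratio $\min(8/1, 10/2) = 5$ occurs for the variable $x_{a_1}$, so $x_{a_1}$ becomes a nonbasic variable.

Step 4 Now $E = [e_1, e_2, \xi]$, where $\xi = \mathrm{col}(2 + M, -1/2, 1/2)$,

$$(\hat{B}_R)^{-1} = EB_R^{-1} = \begin{pmatrix} 1 & 0 & 2 + M \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/2 \end{pmatrix} \begin{pmatrix} 1 & 0 & -M \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/2 \end{pmatrix},$$

$$x_B^{(R)} = (\hat{B}_R)^{-1} b^{(R)} = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/2 \end{pmatrix} \begin{pmatrix} 0 \\ 8 \\ 10 \end{pmatrix} = \begin{pmatrix} 20 \\ 3 \\ 5 \end{pmatrix}.$$

Therefore the next revised simplex tableau is

$$\begin{array}{c|ccc|c}
\text{Variable in the basis} & (z) & (x_3) & (x_1) & x_B^{(R)} \\ \hline
z & 1 & 0 & 2 & 20 \\
x_3 & 0 & 1 & -1/2 & 3 \\
x_1 & 0 & 0 & 1/2 & 5
\end{array}$$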

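Second iteration

Step 1 We have

$$(z_2 - c_2) = (1, 0, 2) \begin{pmatrix} -3 \\ 1 \\ 1 \end{pmatrix} = -1 \quad \text{and} \quad (z_4 - c_4) = (1, 0, 2) \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix} = -2.$$

Therefore $x_4$ becomes a basic variable.

Step 2 We evaluate

$$y_R^{(4)} = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/2 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix} = \begin{pmatrix} -2 \\ 1/2 \\ -1/2 \end{pmatrix}.$$

Step 3 The minimum ratio occurs for the variable $x_3$, so $x_3$ becomes a nonbasic variable.

Step 4 Now $E = [e_1, \xi, e_3]$, where $\xi = \mathrm{col}(4, 2, 1)$,

$$(\hat{B}_R)^{-1} = EB_R^{-1} = \begin{pmatrix} 1 & 4 & 0 \\ 0 & 2 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 1/2 \end{pmatrix} = \begin{pmatrix} 1 & 4 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 0 \end{pmatrix},$$

$$x_B^{(R)} = (\hat{B}_R)^{-1} b^{(R)} = \begin{pmatrix} 1 & 4 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 8 \\ 10 \end{pmatrix} = \begin{pmatrix} 32 \\ 6 \\ 8 \end{pmatrix}.$$

Therefore the next revised simplex tableau is

$$\begin{array}{c|ccc|c}
\text{Variable in the basis} & (z) & (x_4) & (x_1) & x_B^{(R)} \\ \hline
z & 1 & 4 & 0 & 32 \\
x_4 & 0 & 2 & -1 & 6 \\
x_1 & 0 & 1 & 0 & 8
\end{array}$$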
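Third iteration

Step 1 We have

$$(z_2 - c_2) = (1, 4, 0) \begin{pmatrix} -3 \\ 1 \\ 1 \end{pmatrix} = 1,$$

$$(z_3 - c_3) = (1, 4, 0) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 4,$$

$$(z_{a_1} - c_{a_1}) = (1, 4, 0) \begin{pmatrix} M \\ 0 \\ 1 \end{pmatrix} = M.$$

As all $(z_j - c_j) \ge 0$, the current solution, namely $(x_1^* = 8, x_2^* = 0)$, is optimal and the optimal value is $z^* = 32$.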
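The final tableau can be cross-checked against the block formula (3.38) without repeating the iterations. The sketch below assumes the final basis is $(x_4, x_1)$, with columns $\mathrm{col}(0, -1)$ and $\mathrm{col}(1, 2)$ and costs $(0, 4)$; `inverse_2x2` and `revised_inverse` are hypothetical helper names and handle only this $2 \times 2$ case.

```python
from fractions import Fraction

def inverse_2x2(B):
    """Inverse of a 2 x 2 matrix using exact fractions."""
    (a, b), (c, d) = B
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def revised_inverse(c_B, B):
    """Assemble B_R^{-1} = [[1, c_B^T B^{-1}], [0, B^{-1}]] as in (3.38)."""
    Binv = inverse_2x2(B)
    top = [sum(c_B[i] * Binv[i][k] for i in range(2)) for k in range(2)]
    return [[Fraction(1)] + top] + [[Fraction(0)] + row for row in Binv]

B = [[0, 1],      # columns: a^(4) = col(0, -1), a^(1) = col(1, 2)
     [-1, 2]]
BR_inv = revised_inverse([0, 4], B)

# x_B^(R) = B_R^{-1} b^(R), as in (3.41): objective value first, then x4 and x1.
b_R = [0, 8, 10]
xB_R = [sum(row[k] * b_R[k] for k in range(3)) for row in BR_inv]
```

Because the artificial variable is no longer basic, $M$ does not appear in $B_R^{-1}$ at this stage, which is why the check can be done with plain numbers.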


3.7 Summary and Additional Notes 


• Section 3.2 presented certain basic results on the geometry of general LPP's in $\mathbb{R}^n$ which are then used to characterize the optimal solution.
• Section 3.4 is devoted to providing the proofs of all the main theorems used in describing the simplex algorithm.
• Section 3.6 describes a different and more useful implementation of the simplex method, namely the revised simplex method.
• The revised simplex method was developed by Dantzig and Orchard-Hays in 1954.
• The proofs of the main theorems in the simplex algorithm are motivated by Hadley.
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y®) y® 





Given that $x_3$ and $x_4$ are slack variables and the starting basis was $(x_3, x_4)^T$, write the original LPP.


3.6 The following is the current simplex tableau of a LPP in the maximization form

[Tableau not recoverable from the source.]

Determine the conditions on $a, c, d, f, g$ so that the current tableau and the updated tableau represent respectively
1. Non-degenerate and degenerate b.f.s's.
2. Non-degenerate and non-degenerate b.f.s's.
3. Degenerate and non-degenerate b.f.s's.
4. Degenerate and degenerate b.f.s's.

3.7 Solve the following linear programming problem by the simplex method. Also, at each iteration identify $B$ and $B^{-1}$.

Max $3x_1 + 2x_2 + x_3$
subject to
[first constraint illegible in the source; it begins $2x_1 - 3x_2 + \cdots$]
$-x_1 + x_2 + x_3 \le 5$
$x_1, x_2, x_3 \ge 0$.


3.8 Solve the following problem by the simplex method starting with the corner point $(4, 2)$ and show the movement graphically

[Problem statement illegible in the source.]
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3.9 Use the simplex method to show that the following LPP has unbounded solution 


Max $4x_1 + 3x_2$
subject to
$x_1 + x_2 \ge 8$
$2x_1 + x_2 \ge 10$
$x_1, x_2 \ge 0$.

Hence obtain a feasible solution for which the given objective function takes the value 300.


3.10 Solve the following problem by the simplex method starting with the b.f.s. corresponding to the corner point $(x_1, x_2) = (4, 0)$

Max [objective function illegible in the source]
subject to
$3x_1 + 4x_2 = 12$
$2x_1 - x_2 \le 12$
$x_1, x_2 \ge 0$


3.11 Let the following be the simplex tableau of a LPP at some iteration

[Tableau not recoverable from the source.]

State all possible values of $a, b, c, d$ and $e$ in each of the following so that the given statement is true
1. the current solution is optimal
2. the given LPP has unbounded solution
3. the current solution is not optimal but the objective function value can be increased by replacing $x_3$ by $x_2$.
4. the current solution is optimal but the LPP has many optimal solutions

3.12 [Statement largely illegible in the source; it concerns using the simplex algorithm to check whether a system of linear equations $Ax = b$ is consistent.]
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3.13 Use the simplex method to find all optimal solutions of the LPP 


Min $2x_1 + 4x_2$
subject to
[first constraint illegible in the source]
$x_1 - x_2 \ge -1$
$x_1, x_2 \ge 0$

Verify your answer graphically.


3.14 Use the simplex method to show that the system of linear equations

[equations illegible in the source]
$x_1, x_2 \ge 0$

is consistent and hence obtain a solution of the given system.


3.15 Two consecutive simplex tableaus of a given LPP are

[Both tableaus are not recoverable from the source.]

Find the value of $a, b, c, d, e, f, g, h, i, j$ and $k$.
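3.16 Use the simplex method to verify that the following LPP has unbounded solution

Max $2x_1 + 3x_2$
subject to
[first constraint partly illegible in the source; it begins $x_1 - x_2 + x_3$]
$-3x_1 + x_2 \le 4$
$x_1, x_2 \ge 0$.

Hence find a feasible solution for which the value of the objective function is greater than a prescribed bound [bound illegible in the source].

3.17 [Statement largely illegible in the source; it gives a value of $(z_j - c_j)$ for a non-basic variable $x_j$ and the minimum ratio, and asks for the resulting change in the objective function.]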
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On tem 
3.18 Use the simplex algorithm to find a solution of the system

$x_1 + x_2 = 2$
$2x_1 + x_2 = \ldots$ [right-hand side illegible in the source]

Does the system have only one solution? How does it get checked by the simplex method?


3.19 Consider the optimization problem

Max $\sum_{j=1}^{n} |c_j x_j + a_j|$
subject to
[constraints illegible in the source]

1. Is the above a LPP? Give reasons.
2. Can this be transformed into a LPP so as to solve the same by the simplex method?

Verify your answer for the problem

Max $5|x_1| + 6|x_2|$
subject to
$3x_1 + 4x_2 \le 6$
$x_1 + 3x_2 = 2$.

3.20 [Statement not recoverable from the source: the inverse of a given $3 \times 3$ matrix is supplied, and this information is to be used to determine the inverse of a second matrix that differs from the first in one column.]

3.21 Let $S = \{(x_1, x_2) : |x_1| + |x_2| \le 1\}$ and $S_1 = S \cup \{(1, 1)\}$. Let Conv$(S_1)$ denote the convex hull of the set $S_1$. Use the simplex algorithm to maximize $4x_1 + 3x_2$ over Conv$(S_1)$.

3.22 [Statement not recoverable from the source: the inverse of a given matrix is supplied, and this information is to be used to evaluate $P^{-1}$ and $Q^{-1}$ for two related matrices $P$ and $Q$.]
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3.23 Solve the following LPP by the revised simplex algorithm 


Maz 4x1 + 3x2 
subject to 
Xy+X%2<8 
2X1 +X < 10 
X1,%) 20. 
3.24 Let $S = \{(-1, 0), (0, 1), (1, 0)\}$. Use the revised simplex algorithm to minimize
$4x_1 + 3x_2$ over the convex hull of $S$.


3.25 Use the revised simplex algorithm to solve the following LPP


Max $2x_1 + 4x_2$
subject to
$2x_1 + 3x_2 \le 6$
$x_1 + x_2 \ge 1$
$x_1, x_2 \ge 0.$
3.26 Are the following statements true? Give reasons for your answer

1. The system of equations $x_1 + x_2 + x_3 = 3$, $x_1 - x_2 - x_4 = 0$, $x_1, x_2, x_3 \ge 0$, has three
degenerate b.f.s.
9 For the LPP ‘Maz 4x, + 3x2 — 5x3, subject to 4x1 + X2 + 6x3 Da x = 0X3 

unrestricted in sign’, the optimal solution (x},xX5,X3) can never have x, > 0 and 


>. 
3; M A SSE oa, Xo): a = Os = F a > 1} is a polyhedron but not a polytope. 
4. The set $S = \{(x_1, x_2) : x_1 \ge 1 \text{ or } x_2 \ge 1\}$ can never be the feasible region of a LPP.
5. For the constraints $x_1 - x_2 + x_3 = 2$, $x_1, x_2, x_3 \ge 0$, the point ($x_1 = 4$, $x_2 = 2$, $x_3 = 0$)
is a b.f.s.


6. Let $f : [a, b] \to R$ be given by $f(x) = mx + c$. If $f$ attains its minimum at a point
$x^* \in (a, b)$ then $m = 0$.

7. The problem 'Max $z = 4x_1 + 3x_2$ subject to $|x_1 - x_2| = 1$' is a LPP.

8. The set $S = \{(x_1, x_2) : x_1^2 + x_2^2 \le 4,\ x_2 \ge x_1\}$ is a convex set.


3.27 Construct an example of each of the following (if no such example is possible then
state reasons for the same)
1. a convex set in $R^2$ having four corner points.


2. a LPP having exactly 2 keie solutions 


Din space in R? l 
4. a LPP in which more than one degenerate b.f.s. correspond to the same corner point


4
Duality in Linear Programming


4.1 Introduction


By now we must be comfortable in using the simplex method for solving linear programming
problems. One important aspect of the simplex method is that it not only solves the given
LPP (called the primal) but also solves another closely related LPP (called the dual). The
other LPP, namely the dual, remains hidden in the implementation of the simplex method but
its solution is readily available in the optimal simplex tableau of the primal problem
itself. In the study of duality in linear programming we give a well defined procedure to
construct the 'hidden' LPP (namely the dual) and establish those results which bring out
meaningful relationships between the given problem (primal) and its dual. These results,
besides being of theoretical and computational importance, give interesting and useful
economic interpretations.




4.2 The Dual Problem: Motivation Through an Example 


Let us consider the LPP


Max $4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$
$x_1, x_2 \ge 0.$ (4.1)






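This problem has already been solved earlier by the simplex method to get its optimal
solution. With $x_3$ and $x_4$ as the slack variables, the initial and final simplex tableaus are

$x_B$          $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$
$x_3 = 8$        1       1       1       0
$x_4 = 10$       2       1       0       1
$z_j - c_j$     -4      -3       0       0

and

$x_B$          $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$
$x_2 = 6$        0       1       2      -1
$x_1 = 2$        1       0      -1       1
$z_j - c_j$      0       0       2       1

respectively.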
Given the LPP (4.1) let us introduce another LPP as follows


Min $8w_1 + 10w_2$
subject to
$w_1 + 2w_2 \ge 4$
$w_1 + w_2 \ge 3$
$w_1, w_2 \ge 0.$ (4.2)


In this construction (which is purely ad hoc at present) the problem is taken in the
'Min' form as the given problem (4.1) is in the 'Max' form. The roles of $c_j$ ($c_1 = 4$, $c_2 = 3$)
and $b_i$ ($b_1 = 8$, $b_2 = 10$) have been interchanged and, in the constraints of problem (4.2),
the '≥' sign has been taken because in problem (4.1) the constraints are of '≤' type.

As problem (4.2) has only two decision variables ($w_1$ and $w_2$), we can solve it
graphically (see Fig. 4.1) to get its optimal solution as ($w_1 = 2$, $w_2 = 1$)
and its optimal value as 26.
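
The equality of the two optimal values can be checked numerically. The following is only a minimal sketch, not a general LP solver: it assumes a two-variable problem whose optimum is attained at a corner point, and the helper name `feasible_vertices` is hypothetical. It intersects every pair of boundary lines, keeps the feasible intersection points and evaluates the objective there, using exact fractions.

```python
from fractions import Fraction
from itertools import combinations

def feasible_vertices(constraints):
    """Intersect every pair of boundary lines p*x + q*y = r and keep feasible points.

    Each constraint is a tuple (p, q, r, sense) with sense '<=' or '>='.
    """
    def ok(x, y):
        return all((p * x + q * y <= r) if s == "<=" else (p * x + q * y >= r)
                   for p, q, r, s in constraints)

    points = set()
    for (p1, q1, r1, _), (p2, q2, r2, _) in combinations(constraints, 2):
        det = p1 * q2 - q1 * p2
        if det == 0:
            continue  # parallel boundary lines never meet
        x = Fraction(r1 * q2 - q1 * r2, det)  # Cramer's rule
        y = Fraction(p1 * r2 - r1 * p2, det)
        if ok(x, y):
            points.add((x, y))
    return points

# Problem (4.1) and problem (4.2), including the sign restrictions.
primal = [(1, 1, 8, "<="), (2, 1, 10, "<="), (1, 0, 0, ">="), (0, 1, 0, ">=")]
dual = [(1, 2, 4, ">="), (1, 1, 3, ">="), (1, 0, 0, ">="), (0, 1, 0, ">=")]

primal_value, primal_point = max((4 * x + 3 * y, (x, y)) for x, y in feasible_vertices(primal))
dual_value, dual_point = min((8 * x + 10 * y, (x, y)) for x, y in feasible_vertices(dual))
print(primal_point, primal_value)
print(dual_point, dual_value)
```

Under these assumptions the best corner of (4.1) is $(x_1 = 2, x_2 = 6)$ and the best corner of (4.2) is $(w_1 = 2, w_2 = 1)$, both with objective value 26.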













Fig. 4.1. 


If we now look at the optimal simplex tableau of problem (4.1) given here, we notice two
things. The optimal values of problems (4.1) and (4.2) are equal, and the optimal solution
($w_1 = 2$, $w_2 = 1$) of problem (4.2) is really present in the last row of this tableau. Is
this just a coincidence or is there something deeper in it? The duality theory of linear
programming asserts that this is always going to happen, i.e. if the given LPP (primal) has
an optimal solution then its dual will certainly have an optimal solution and further the
optimal values of the two problems will be equal.

We now proceed to the construction of the dual for a general LPP and then to establish the
basic duality relationship between this pair of LPP's.

4.3 Construction of the Dual

Let the given LPP (called primal) be

Max $c^T x$
subject to
$Ax \le b$
$x \ge 0,$ (4.3)

where $x \in R^n$, $c \in R^n$, $b \in R^m$ and $A$ is an $(m \times n)$ real matrix.
We now construct another associated problem, called the dual of the primal problem (4.3),
as follows

Min $b^T w$
subject to
$A^T w \ge c$
$w \ge 0,$ (4.4)

where $w \in R^m$.


The LPP's (4.3)-(4.4) together are called the primal-dual pair. Also the components
of vector $x$ (respectively vector $w$) are called the primal variables (respectively dual
variables) and the constraints $Ax \le b$ (respectively $A^T w \ge c$) are called the primal
constraints (respectively dual constraints). Here it may be noted that the number of
dual constraints equals the number of primal variables and the number of dual variables
equals the number of primal constraints. Further, if we write problem (4.4) in the 'max'
form (i.e. in the form of (4.3)) and write its dual we get problem (4.3). Thus if we take
the dual of the dual we get back the primal, i.e. dual (dual) = primal.

As the primal-dual pair (4.3)-(4.4) is symmetric in this sense, we can call either problem
the primal and the other one its dual, i.e. what we are calling the primal can be
identified as the dual and vice versa. However, in our discussion we take problem (4.3) as
the primal and problem (4.4) as its dual.

Next let us consider the situation when the constraints of the given LPP (we shall
continue calling it primal) are of '≥' type, i.e. the given LPP has the following form

Max $c^T x$
subject to
$Ax \ge b$
$x \ge 0.$ (4.5)

Now problem (4.5) can be rewritten in the form

Max $c^T x$
subject to
$(-A)x \le (-b)$
$x \ge 0,$

and then using the construction (4.3)-(4.4) we can write its dual as

Min $(-b)^T u$
subject to
$(-A)^T u \ge c$
$u \ge 0.$

Now if we write $w = -u$, then the above problem can be written as

Min $b^T w$
subject to
$A^T w \ge c$
$w \le 0.$ (4.6)

If we now compare problems (4.4) and (4.6), then we see that (4.6) is of the same form as
(4.4) except that here $w \le 0$ rather than $w \ge 0$. This is because the primal constraints
are of '≥' type.

We next consider the case when the constraints of the primal are of '=' type, i.e. the
primal problem is

Max $c^T x$
subject to
$Ax = b$
$x \ge 0.$ (4.7)

This problem can be rewritten as

Max $c^T x$
subject to
$Ax \le b$
$(-A)x \le -b$
$x \ge 0.$ (4.8)


Now using the construction (4.3)-(4.4), we get the dual of the above problem as

Min $b^T u - b^T v$
subject to
$A^T u - A^T v \ge c$
$u \ge 0$
$v \ge 0.$ (4.9)

If we now write $w = (u - v)$ then the above problem can be written as

Min $b^T w$
subject to
$A^T w \ge c$
$w$ unrestricted in sign. (4.10)

A comparison of (4.4) and (4.10) tells that the only difference here is that $w$ is
unrestricted in sign. Again this has happened because the primal constraints are of '='
type.

The above discussion tells that the signs of the dual variables $w$ are determined by the
signs of the primal constraints. If the $i$th primal constraint is of '≤' type then the $i$th
dual variable $w_i \ge 0$; if the $i$th primal constraint is of '≥' type then the $i$th dual
variable $w_i \le 0$; and if the $i$th primal constraint is of '=' type then the $i$th dual
variable $w_i$ is unrestricted in sign. Here it may be noted that we have not proved the
result exactly in this manner, but the above is certainly the conclusion of what we have
proved in our discussion.

Just as the sign of dual variables is determined by the sign of primal constraints,
because of the symmetric nature of duality, we expect that the sign of primal variables
will determine the sign of the dual constraints. Using the construction (4.3)-(4.4), it is
not difficult to prove that if $x_j \ge 0$ then the $j$th dual constraint will be of '≥' type;
if $x_j \le 0$ then the $j$th dual constraint will be of '≤' type; and, if $x_j$ is unrestricted
in sign, then the $j$th dual constraint will be of '=' type.

The above construction of the dual is valid for any general LPP (primal) which is in the
'Max' form, and we summarize this discussion in the form of a table called the rule table
for the construction of the dual. We read this table from 'left' to 'right' when the given
LPP (primal) is in the 'Max' form.

of '≤' type, $w_2$ unrestricted in sign as the second primal constraint is of '=' type and
$w_3 \le 0$ as the third primal constraint is of '≥' type.


Example 4.3.2 Write the dual of the following LPP

Min $z = 2x_1 - x_2 + x_3$
subject to
$2x_1 + x_2 - x_3 \le 8$
$-x_1 + x_3 \ge 1$
$x_1 + 2x_2 + 3x_3 = 9$ (4.12)
$x_1 \ge 0$, $x_2 \ge 0$, $x_3$ unrestricted in sign.

Solution As the given (primal) LPP is in the 'Min' form we read the rule table from
'right' to 'left' and get its dual as

Max $w = 8w_1 + w_2 + 9w_3$
subject to
$2w_1 - w_2 + w_3 \le 2$
$w_1 + 2w_3 \le -1$
$-w_1 + w_2 + 3w_3 = 1$
$w_1 \le 0$, $w_2 \ge 0$, $w_3$ unrestricted in sign.


Remark 4.3.1 Here it may be remarked that it is not essential to use the rule table. In
fact we can always write problem (4.11) in the form of problem (4.3) and use the
construction (4.4) to get its dual by the first principle. Similarly the dual of problem (4.12)
can also be obtained by the first principle. We use the rule table for convenience only
and there is absolutely no compulsion to use it.
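
The sign rules derived above can be written down mechanically. The sketch below is one possible encoding of the rule table; the function name `write_dual` and the string labels (`">=0"`, `"<=0"`, `"free"`) are illustrative choices and not notation from the text. It reads the table left to right for a 'Max' primal and right to left for a 'Min' primal.

```python
# Rule table: constraint type of the primal -> sign of the dual variable,
# and sign of the primal variable -> type of the dual constraint.
MAX_TO_MIN = {
    "constraint_to_variable": {"<=": ">=0", ">=": "<=0", "=": "free"},
    "variable_to_constraint": {">=0": ">=", "<=0": "<=", "free": "="},
}
MIN_TO_MAX = {
    "constraint_to_variable": {">=": ">=0", "<=": "<=0", "=": "free"},
    "variable_to_constraint": {">=0": "<=", "<=0": ">=", "free": "="},
}

def write_dual(sense, c, A, b, constraint_types, variable_signs):
    """Return the dual of: <sense> c^T x subject to (A x)_i <type_i> b_i, x_j of given sign."""
    rules = MAX_TO_MIN if sense == "Max" else MIN_TO_MAX
    return {
        "sense": "Min" if sense == "Max" else "Max",
        "objective": list(b),                       # roles of b and c are interchanged
        "matrix": [list(col) for col in zip(*A)],   # A is replaced by its transpose
        "rhs": list(c),
        "constraint_types": [rules["variable_to_constraint"][s] for s in variable_signs],
        "variable_signs": [rules["constraint_to_variable"][t] for t in constraint_types],
    }

# Example 4.3.2 (a 'Min' problem, so the table is read right to left).
example = write_dual(
    "Min",
    c=[2, -1, 1],
    A=[[2, 1, -1], [-1, 0, 1], [1, 2, 3]],
    b=[8, 1, 9],
    constraint_types=["<=", ">=", "="],
    variable_signs=[">=0", ">=0", "free"],
)
print(example)
```

Applying `write_dual` twice returns the original data, which is the property dual (dual) = primal noted earlier.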


4.4 Duality Theorems


Let us consider the primal-dual pair as

Max $c^T x$
subject to
$Ax \le b$
$x \ge 0$ (4.13)

and

Min $b^T w$
subject to
$A^T w \ge c$
$w \ge 0.$ (4.14)

Theorem 4.4.2 (Strong Duality Theorem).

(i) Let $\bar{x}$ be an optimal solution of the primal. Then there exists a $\bar{w}$ which is optimal to
the dual. Also $c^T \bar{x} = b^T \bar{w}$.

(ii) Let $w^*$ be an optimal solution of the dual. Then there exists an $x^*$ which is optimal
to the primal. Also $b^T w^* = c^T x^*$.

Here part (i) is called the direct duality theorem and part (ii) is called the converse
duality theorem. As dual (dual) = primal, it is enough to prove part (i) only.

There are many proofs of this theorem but most of these only show the existence of an
optimal solution $\bar{w}$ for the dual, given that $\bar{x}$ is optimal to the primal.

Our proof presented here is constructive in the sense that given $\bar{x}$, we actually
construct $\bar{w}$, and hence this proof is probably more useful from an application point of
view.


Proof. Without any loss of generality we can take the primal-dual pair as

Max $c^T x$
subject to
$Ax = b$
$x \ge 0,$

and

Min $b^T w$
subject to
$A^T w \ge c$
$w$ unrestricted in sign. (4.19)

Also, again without any loss of generality, we can assume that $\bar{x}$, which is optimal to
problem (4.19), has been obtained by solving (4.19) by the simplex method. Thus $\bar{x}$
is an optimal basic feasible solution of (4.19), i.e. for some basis matrix $B$,
$\bar{x} = (x_B = B^{-1}b,\ 0)$. But $\bar{x}$ is optimal to problem (4.19), hence we have

$(z_j - c_j) \ge 0$ for all $j$,

i.e.

$(c_B^T B^{-1} a^{(j)} - c_j) \ge 0$ for all $j$,

i.e.

$A^T \bar{w} \ge c$, where $\bar{w}^T = c_B^T B^{-1}$.

Thus given an optimal solution $\bar{x}$ of the primal (4.19) we have been able to construct
a vector $\bar{w}$, given by $\bar{w}^T = c_B^T B^{-1}$, such that $\bar{w}$ is feasible to the dual (4.19). Now if we
prove that $c^T \bar{x} = b^T \bar{w}$ then because of the weak duality theorem $\bar{w}$ will be optimal to
the dual (4.19). But this is true as

$c^T \bar{x} = c_B^T x_B = c_B^T (B^{-1} b) = (c_B^T B^{-1}) b = \bar{w}^T b = b^T \bar{w}.$

Hence the result.


Example 4.4.1 Write the dual of the following LPP (primal)

Max $4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$ (4.21)
$x_1, x_2 \ge 0,$

and use Theorem 4.4.2 to find the solution of the dual by solving the primal.

Solution We have already seen that the dual of problem (4.21) is

Min $8w_1 + 10w_2$
subject to
$w_1 + 2w_2 \ge 4$
$w_1 + w_2 \ge 3$ (4.22)
$w_1, w_2 \ge 0.$

We are required to determine an optimal solution of (4.22) by solving (4.21).
From Section 4.2 we know that the initial and final simplex tableaus for problem
(4.21) are

$x_B$          $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$
$x_3 = 8$        1       1       1       0
$x_4 = 10$       2       1       0       1
$z_j - c_j$     -4      -3       0       0

and

$x_B$          $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$
$x_2 = 6$        0       1       2      -1
$x_1 = 2$        1       0      -1       1
$z_j - c_j$      0       0       2       1

respectively. Therefore $\bar{x}$ = ($x_1 = 2$, $x_2 = 6$, $x_3 = 0$, $x_4 = 0$) is optimal to (4.21) and
we have to find $\bar{w}$, i.e. an optimal solution of problem (4.22).

Now from the last tableau of the primal (4.21) we get $c_B = \text{col}(c_2, c_1) = \text{col}(3, 4)$ (and
NOT col(4, 3)), as in the tableau the basic variables are $x_B$ = ($x_2 = 6$ and $x_1 = 2$), and

$B^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}.$

Hence

$\bar{w}^T = (\bar{w}_1, \bar{w}_2) = c_B^T B^{-1} = (3, 4) \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix} = (2, 1),$

i.e. $\bar{w}_1 = 2$ and $\bar{w}_2 = 1$ is optimal to the dual (4.22). Further by the duality theorem, the
optimal value of the dual equals the optimal value of the primal which is 26.
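
The construction $\bar{w}^T = c_B^T B^{-1}$ used in this example can be reproduced in a few lines. This is a minimal sketch for the 2 x 2 case only; the helper `inverse_2x2` is a hypothetical name, and the basis matrix is formed from the columns of $A$ in the order ($x_2$, $x_1$) in which the basic variables appear in the final tableau.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a nonsingular 2 x 2 matrix, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Problem (4.21): A = [[1, 1], [2, 1]], b = (8, 10), c = (4, 3).
# Basis columns in tableau order: a^(2) = (1, 1) and a^(1) = (1, 2).
B = [[1, 1], [1, 2]]
c_B = [3, 4]          # col(c_2, c_1), NOT col(4, 3)
b = [8, 10]

B_inv = inverse_2x2(B)
w_bar = [sum(c_B[i] * B_inv[i][j] for i in range(2)) for j in range(2)]  # c_B^T B^{-1}
x_B = [sum(B_inv[i][j] * b[j] for j in range(2)) for i in range(2)]      # B^{-1} b

primal_value = sum(ci * xi for ci, xi in zip(c_B, x_B))
dual_value = sum(bi * wi for bi, wi in zip(b, w_bar))
print(w_bar, x_B, primal_value, dual_value)
```

Swapping the order of `c_B` without swapping the columns of `B` gives a different (wrong) vector, which is the point of the warning about col(3, 4) versus col(4, 3).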


Remark 4.4.2 The components of the vector $\bar{w}$, namely (2, 1), are also available in the
last row of the optimal simplex tableau of problem (4.21). So the obvious question is: can
we just read the optimal solution of the dual directly from the last tableau of the
primal so that there is no need of identifying $B^{-1}$ and computing $c_B^T B^{-1}$? The answer is
in the affirmative and that we discuss in Section 4.5.


Now to prove the existence theorem and the complementary slackness theorem, we
again consider the primal-dual pair as stated at (4.13) and (4.14).


Theorem 4.4.3 (Existence Theorem).


(i) If primal and dual both have feasible solutions then both have optimal solutions.


(ii) If primal (dual) has unbounded solution then the dual (primal) has no feasible
solution.


(iii) If primal (dual) has no feasible solution but the dual (primal) has feasible solution
then the dual (primal) has unbounded solution.


Proof. (i) Let $x$ and $w$ be feasible solutions of the primal and the dual respectively. Then
by the weak duality theorem (Theorem 4.4.1) $c^T x \le b^T w$. Since $b^T w$ is finite and is an
upper bound (not necessarily the least upper bound) on the primal objective function
value, we infer that the primal has an optimal solution. Similar arguments hold for the
dual.


(ii) If possible let the dual have a feasible solution. Then as primal and dual both become
feasible, by part (i), both have optimal solutions. But this contradicts that the primal has
unbounded solution. Therefore the dual should be infeasible.


(iii) If possible let the dual have a bounded (finite) optimal solution. Then by the strong
duality theorem (Theorem 4.4.2) the primal should also have an optimal solution and
hence certainly be feasible. But this contradicts the hypothesis that the primal is
infeasible. Therefore the dual should have unbounded solution.


> t 
The statement of the existence theorem can also be understood by the following


4.5 A Convenient Way for Reading the Solution of the Dual
Consider the following primal LPP

Max $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$
subject to
$a_{11} x_1 + \cdots + a_{1n} x_n \le b_1$
$\vdots$
$a_{r1} x_1 + \cdots + a_{rn} x_n \le b_r$
$a_{r+1,1} x_1 + \cdots + a_{r+1,n} x_n \ge b_{r+1}$
$\vdots$
$a_{r+s,1} x_1 + \cdots + a_{r+s,n} x_n \ge b_{r+s}$ (4.24)
$a_{r+s+1,1} x_1 + \cdots + a_{r+s+1,n} x_n = b_{r+s+1}$
$\vdots$
$a_{r+s+k,1} x_1 + \cdots + a_{r+s+k,n} x_n = b_{r+s+k}$
$x_1 \ge 0, \ldots, x_n \ge 0.$

Here $r + s + k = m$ and, without any loss of generality, we can assume that the first $r$
constraints are with '≤' sign, the next $s$ constraints are with '≥' sign and the next $k$
constraints are with '=' sign. Also $b \ge 0$. The dual of problem (4.24) is

Min $b^T w$
subject to
$A^T w \ge c$
$w_1, w_2, \ldots, w_r \ge 0$
$w_{r+1}, w_{r+2}, \ldots, w_{r+s} \le 0$
$w_{r+s+1}, w_{r+s+2}, \ldots, w_{r+s+k}$ unrestricted in sign, (4.25)

where $A = (a_{ij})$ is the matrix of order $(m \times n)$.


From the last section we know that if $\bar{x}$ is optimal to the primal (4.24) then
$\bar{w}^T = c_B^T B^{-1}$ is optimal to (4.25), where $c_B$ and $B^{-1}$ are read from the last (optimal)
simplex tableau of the primal (4.24). Here we may see that as the constraints in (4.24) are
mixed, different components of $w$ in (4.25) will be constrained to have different signs.

The main point which we wish to bring out here is that to find $\bar{w}$ we do not need to
evaluate $c_B^T B^{-1}$ but rather read it from specific elements in the last row of the optimal
simplex tableau of the primal.

Let us assume that problem (4.24) has been solved by using the Big M method (we may make
appropriate changes in the arguments if it is solved by the two phase method). Let us
first try to compute the number of columns which the simplex tableau will have. Obviously
there will be $n$ columns for the variables $x_1, x_2, \ldots, x_n$; $r$ columns for the $r$ slack
variables $x_{n+1}, x_{n+2}, \ldots, x_{n+r}$; $s$ columns for the $s$ surplus variables
$x_{n+r+1}, x_{n+r+2}, \ldots, x_{n+r+s}$; $s$ columns for the $s$ artificial variables
$x_{n+r+s+1}, \ldots, x_{n+r+2s}$ which have been added to the '≥' type constraints; and $k$ columns
for the $k$ artificial variables $x_{n+r+2s+1}, \ldots, x_{n+r+2s+k}$ which have been added to the
'equal to' constraints. Therefore the (initial) simplex tableau for the problem (4.24)
will look like

$x_B$ | columns of $A$ | slack columns | surplus columns | artificial columns for '≥' type constraints | artificial columns for '=' type constraints


and at the optimality, i.e. in the last tableau, $(z_j - c_j) \ge 0$ for all
$j = 1, \ldots, n, n+1, \ldots, n+r, n+r+1, \ldots, n+r+s, n+r+s+1, \ldots, n+r+2s, n+r+2s+1, \ldots, n+r+2s+k$.

Now let us compute $\bar{w}_1$, i.e. the first component of the optimal dual vector $\bar{w}$. As
$\bar{w}^T = c_B^T B^{-1}$, we have

$\bar{w}_1$ = first component of $c_B^T B^{-1}$
$= c_B^T B^{-1} e_1$, where $e_1 = \text{col}(1, 0, 0, \ldots, 0)$
$= c_B^T (B^{-1} e_1)$
$= z_{n+1}$
$= (z_{n+1} - c_{n+1})$ $(\ge 0)$.

This is because $B^{-1} e_1$ is the $y$-column for the first slack variable $x_{n+1}$, i.e. $y^{(n+1)}$,
$c_{n+1} = 0$, and $(z_j - c_j) \ge 0$ for all $j$; so in particular for $j = n+1$.
Similarly $\bar{w}_2 = (z_{n+2} - c_{n+2}) \ge 0, \ldots, \bar{w}_r = (z_{n+r} - c_{n+r}) \ge 0$. So the first $r$
components of the optimal dual vector $\bar{w}$ are non-negative as desired.

Next let us compute $\bar{w}_{r+1}$, i.e. the $(r+1)$th component of the vector $\bar{w}$. We have

$\bar{w}_{r+1}$ = $(r+1)$th component of $c_B^T B^{-1}$
$= c_B^T B^{-1} e_{r+1} = -c_B^T B^{-1} (-e_{r+1})$
$= -c_B^T (B^{-1} (-e_{r+1}))$
$= -z_{n+r+1}$
$= -(z_{n+r+1} - c_{n+r+1})$ $(\le 0)$.

This is because $B^{-1}(-e_{r+1})$ is the $y$-column for the first surplus variable, i.e.
$y^{(n+r+1)}$, $c_{n+r+1} = 0$, and $-(z_{n+r+1} - c_{n+r+1}) \le 0$ because $(z_{n+r+1} - c_{n+r+1}) \ge 0$.
Similarly $\bar{w}_{r+2} = -(z_{n+r+2} - c_{n+r+2}) \le 0, \ldots, \bar{w}_{r+s} = -(z_{n+r+s} - c_{n+r+s}) \le 0$. So the
next $s$ components of the vector $\bar{w}$ are non-positive as desired.

Finally,

$\bar{w}_{r+s+1} = c_B^T B^{-1} e_{r+s+1}$
$= c_B^T (B^{-1} e_{r+s+1})$
$= z_{n+r+2s+1}.$

This is because $B^{-1} e_{r+s+1}$ is the $y$-column for the first artificial variable which has been
added to the '=' type constraints, i.e. $y^{(n+r+2s+1)}$, and $z_{n+r+2s+1}$ can have any sign.
Similarly $\bar{w}_{r+s+2} = z_{n+r+2s+2}, \ldots, \bar{w}_{r+s+k} = z_{n+r+2s+k}$, which are unrestricted as desired.

The above discussion can be summarized as follows


In the optimal dual solution $\bar{w}$,

(i) those components of $\bar{w}$ which are constrained to be '≥ 0' are obtained as the values
of $(z_j - c_j)$ for the $r$ slack columns in the optimal simplex tableau.

(ii) those components of $\bar{w}$ which are constrained to be '≤ 0' are obtained as the values
of $-(z_j - c_j)$ for the $s$ surplus columns in the optimal simplex tableau.

(iii) those components of $\bar{w}$ which are unrestricted in sign are obtained as the values
of $z_j$ for the $k$ artificial columns (for the '=' constraints) in the optimal simplex tableau.


Therefore if the optimal simplex tableau of the primal is given then the optimal
solution $\bar{w}$ of the dual can just be read from the last row as specified above rather than
evaluating $c_B^T B^{-1}$.


Example 4.5.1 Write the dual of the following (primal) LPP and solve the dual by
solving the primal

Max $-2x_1 - x_2$
subject to
$3x_1 + x_2 = 3$
$4x_1 + 3x_2 \ge 6$
$x_1 + 2x_2 \le 3$
$x_1, x_2 \ge 0.$

Solution Using the rule table, we get the dual as

Min $3w_1 + 6w_2 + 3w_3$
subject to
$3w_1 + 4w_2 + w_3 \ge -2$
$w_1 + 3w_2 + 2w_3 \ge -1$
$w_1$ unrestricted in sign, $w_2 \le 0$, $w_3 \ge 0$.

To obtain the solution of the above problem (dual) from the primal, we solve the given
LPP by the Big M method. The initial and final simplex tableaus are

$x_B$                   $x_1$        $x_2$      surplus   artificial ('=')   artificial ('≥')   slack
artificial ('=') = 3     3          1          0           1                  0             0
artificial ('≥') = 6     4          3         -1           0                  1             0
slack = 3                1          2          0           0                  0             1
$z_j - c_j$           (-7M+2)    (-4M+1)       M           0                  0             0

and

$x_B$                   $x_1$        $x_2$      surplus   artificial ('=')   artificial ('≥')   slack
$x_1 = 3/5$              1          0         1/5         3/5               -1/5            0
$x_2 = 6/5$              0          1        -3/5        -4/5                3/5            0
slack = 0                0          0          1           1                 -1             1
$z_j - c_j$              0          0         1/5       (M-2/5)            (M-1/5)          0

respectively.
Therefore an optimal solution of the primal is $x_1 = 3/5$, $x_2 = 6/5$ and the optimal
value is $-12/5$.


Now we wish to get an optimal solution of the dual by reading certain specific
elements in the last row of the above tableau. For this we note that $w_1$ is unrestricted
in sign and hence the value of $\bar{w}_1$ in the optimal dual solution $\bar{w}$ will be the value of $z_j$
for that artificial variable which corresponds to the '=' constraint.


Thus
$\bar{w}_1$ = value of $z_j$ for the artificial column of the '=' constraint
$= (z_j - c_j) + c_j$ for that column
$= (M - 2/5) + (-M)$
$= -2/5.$
$\bar{w}_2$ = value of $-(z_j - c_j)$ for the surplus column
$= -1/5.$
$\bar{w}_3$ = value of $(z_j - c_j)$ for the slack column
$= 0,$
and the optimal value of the dual is same as the optimal value of the primal, i.e. $-12/5$.
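
The values read from the last row can be cross-checked against the formula $\bar{w}^T = c_B^T B^{-1}$, i.e. by solving $B^T \bar{w} = c_B$. The sketch below assumes that the optimal basis consists of $x_1$, $x_2$ and the slack variable of the third constraint; `solve` is a small hypothetical Gauss-Jordan helper written for this illustration.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination in exact arithmetic (M nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

# Data of Example 4.5.1.
A = [[3, 1], [4, 3], [1, 2]]
b = [3, 6, 3]
c = [-2, -1]

# Assumed optimal basis: columns of x1, x2 and of the slack in the third constraint.
B = [[3, 1, 0], [4, 3, 0], [1, 2, 1]]
c_B = [-2, -1, 0]

x_B = solve(B, b)                                    # B x_B = b
w_bar = solve([list(col) for col in zip(*B)], c_B)   # B^T w = c_B

primal_value = sum(ci * xi for ci, xi in zip(c_B, x_B))
dual_value = sum(bi * wi for bi, wi in zip(b, w_bar))
dual_feasible = all(sum(A[i][j] * w_bar[i] for i in range(3)) >= c[j] for j in range(2))
print(x_B, w_bar, primal_value, dual_value, dual_feasible)
```

The computed vector also respects the sign pattern required by the rule table: the second component is non-positive and the third is non-negative.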


Here we may recollect the revised simplex method and note that the solution of
the dual is automatically available in the first row of the revised simplex tableau.
4.6 Some Useful Deductions from the Complementary Slackness Theorem

The complementary slackness conditions as stated in Theorem 4.4.4 provide certain
useful information about the optimal solution of the dual (primal), knowing the optimal
solution of the primal (dual). For this let us consider the first complementary slackness
condition, namely $\bar{w}^T (A\bar{x} - b) = 0$, which can be expanded as

$\sum_{i=1}^{m} \bar{w}_i \left( \sum_{j=1}^{n} a_{ij} \bar{x}_j - b_i \right) = 0.$ (4.26)

But in (4.26), the finite sum of $m$ non-positive quantities equals zero and hence we
get

$\bar{w}_i \left( \sum_{j=1}^{n} a_{ij} \bar{x}_j - b_i \right) = 0 \quad (i = 1, 2, \ldots, m).$ (4.27)

Now for each $i$, the L.H.S. of (4.27) is the product of two numbers and therefore

$\sum_{j=1}^{n} a_{ij} \bar{x}_j < b_i \Rightarrow \bar{w}_i = 0,$ (4.28)

and

$\bar{w}_i > 0 \Rightarrow \sum_{j=1}^{n} a_{ij} \bar{x}_j = b_i.$ (4.29)

In view of the above relations we infer that if in the optimal solution $\bar{x}$ of the primal
the $i$th constraint holds as a strict inequality then in the optimal solution $\bar{w}$ of the dual,
the $i$th component, namely $\bar{w}_i$, is zero. Equivalently if in the optimal solution $\bar{w}$ of the
dual, $\bar{w}_i > 0$, then at the optimal solution $\bar{x}$ of the primal the $i$th primal constraint holds as
an equation. Therefore (4.28)-(4.29) are also called complementary slackness conditions.








Economic Interpretation of the Complementary Slackness Conditions


HMA a a a a nomic ji : 
nd related dual interpretation then 1t 


SVOM dualit (r res . 
‘> this ; true and “ila also have some 
he r details we may refer 


ip 

VASS = 

6 E a= aa k < 

ALLL ~ D t. ~ J 
cy 
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fa S J63, 13 
aeaninse of +} T, 80). However, as a! 


~ALiI| 15 ot the : j 
"He complementary slackness 
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In the primal problem (4.13), let us interpret the parameters b, c and A as follows 


$b_i$ = units of the $i$th raw material available ($i = 1, 2, \ldots, m$)
$c_j$ = per unit profit for the $j$th product ($j = 1, 2, \ldots, n$)
$a_{ij}$ = units of the $i$th raw material used in producing one unit of the $j$th product
($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$).


Thus we are given $m$ different raw materials which can be used to produce $n$ different
products. These raw materials are available in limited supply and the entire output can
be sold in the market. Our aim is to obtain the units of each product to be produced
so that total profit is maximum. It is reasonably simple to see that the mathematical
model of the above problem leads to problem (4.13).

Let us now look at the dual problem (4.14) in the context of the economic interpretation
of problem (4.13) as described above. We note that the dimension of $c_j$ is rupees
per unit of the $j$th product and that of $a_{ij}$ is units of the $i$th raw material per unit of the
$j$th product. Therefore the constraints $\sum_{i=1}^{m} a_{ij} w_i \ge c_j$ imply that the dimension of $w_i$
must be rupees per unit of the $i$th raw material because in the given inequality the dimensions
of both sides must be the same.

Thus to each raw material $i$, there corresponds a dual variable $w_i$ which gives the
valuation (or price) of one unit of the $i$th raw material. In economics the $w_i$ are called the
shadow prices and are different from the actual prevailing prices as explained below.

Let us consider the situation where it is decided that the raw materials on hand
should be insured against fire, theft etc. This insurance is intended to protect the total
income of the products after they are sold in the market. The problem is to find that
insurance scheme which is large enough to provide the full compensation and at the
same time minimizes the total insurance cost. It can be seen that the mathematical
model of this latter problem is precisely the dual problem (4.14).

The dual variables $w_i$ are called imputed values or shadow prices for the various
raw materials. As mentioned earlier these $w_i$ are not the actual per unit costs of raw
materials but rather what we really perceive about them. If we now look at the first
complementary slackness condition

$\sum_{j=1}^{n} a_{ij} \bar{x}_j < b_i \Rightarrow \bar{w}_i = 0,$

we see that if in our profit maximization problem, at the optimal production level, the
$i$th raw material is not used fully (i.e. $\sum_{j=1}^{n} a_{ij} \bar{x}_j < b_i$) then that raw material should
have zero shadow price, because in the optimal solution of the dual $\bar{w}_i = 0$.


Alternate Algorithms for LPP's

The complementary slackness theorem also opens up possibilities of developing new
algorithms for solving LPP's. This is because it says that if we achieve the primal
feasibility, the dual feasibility and the complementary slackness conditions then we
have optimality for both primal and dual. As there are three things to be achieved in
the end, we may think of algorithms which maintain any two of these throughout and
stop when the third is also achieved.

In the simplex method we start from a primal feasible solution and make sure that it
remains feasible for all iterations. We also maintain the complementary slackness condition
throughout (why?) and stop when the dual feasibility is attained (i.e. all $(z_j - c_j) \ge 0$).
Instead of this we can think of an algorithm which starts from a dual feasible solution
(i.e. all $(z_j - c_j) \ge 0$) and keeps it dual feasible throughout (i.e. all $(z_j - c_j) \ge 0$ for all
iterations); maintains complementary slackness conditions and stops when the primal
feasibility is achieved (i.e. all $x_{B_i} \ge 0$). This, in view of the complementary
slackness theorem, is certainly a valid algorithm for solving LPP's. This algorithm is
called the dual simplex method which we shall study in Section 4.7. Another possibility
could be to start from a basic solution which is neither primal feasible nor dual feasible
and combine the simplex method and the dual simplex method suitably so that in the
end all conditions of the complementary slackness theorem are met. Such methods are called
the criss-cross methods, which have been studied mainly by S. Zionts [172].





4.7 The Dual Simplex Method


The dual simplex method is essentially equivalent to solving the dual of the given problem
by the usual simplex method. Now let the given LPP be

Max $c^T x$
subject to
$Ax = b$
$x \ge 0.$ (4.30)

The steps of the dual simplex method are as follows.

Step 1 Start with a primal basic solution $x_B = B^{-1}b$ (for some basis matrix $B$) for which
all $(z_j - c_j) \ge 0$.

Step 2 Check if all $x_{B_i} \ge 0$. If the answer is 'yes' then we stop as $x_B$ is optimal to the
given LPP. If some $x_{B_i} < 0$ then we move to Step 3.

Step 3 Find the most negative value of $x_{B_i}$, i.e. $x_{B_r} = \min\{x_{B_i} : x_{B_i} < 0\}$. Then $x_{B_r}$
becomes nonbasic and the column $b^{(r)}$ of the variable $x_{B_r}$ leaves the basis. The row
corresponding to the variable $x_{B_r}$ becomes the pivot row.

Step 4 Find

$\frac{z_k - c_k}{y_{rk}} = \max_j \left\{ \frac{z_j - c_j}{y_{rj}} : y_{rj} < 0 \right\};$

then the variable $x_k$ becomes a basic variable and the corresponding column $a^{(k)}$ enters
the basis. The column corresponding to $(z_k - c_k)$ becomes the pivot column.

Step 5 Get the new dual simplex tableau by using the pivoting which is exactly same
as in the usual simplex method, and then go to Step 2.
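
The five steps above can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not a production solver: the function name `dual_simplex` and its argument layout are hypothetical, the tableau must already be in canonical form for the starting basis (so `rows[i][j]` plays the role of $y_{ij}$ and `rhs[i]` of $x_{B_i}$), ties are broken by taking the first candidate, and no safeguard against cycling is included.

```python
from fractions import Fraction

def dual_simplex(rows, rhs, basis, c):
    """Dual simplex method for Max c^T x, given a canonical starting tableau.

    Returns (x, z) at optimality, or None if some pivot row has no negative
    entry, in which case the LPP is infeasible.
    """
    rows = [[Fraction(v) for v in r] for r in rows]
    rhs = [Fraction(v) for v in rhs]
    basis = list(basis)
    m, n = len(rows), len(c)
    while True:
        # z_j - c_j = c_B^T y^(j) - c_j for the current tableau
        zc = [sum(c[basis[i]] * rows[i][j] for i in range(m)) - c[j] for j in range(n)]
        if any(v < 0 for v in zc):
            raise ValueError("Step 1 violated: the basis is not dual feasible")
        # Step 2: stop when the basic solution is primal feasible
        if all(v >= 0 for v in rhs):
            x = [Fraction(0)] * n
            for i, j in enumerate(basis):
                x[j] = rhs[i]
            return x, sum(cj * xj for cj, xj in zip(c, x))
        # Step 3: pivot row = most negative basic variable
        r = min(range(m), key=lambda i: rhs[i])
        # Step 4: maximum ratio criterion over the negative entries of the pivot row
        candidates = [j for j in range(n) if rows[r][j] < 0]
        if not candidates:
            return None
        k = max(candidates, key=lambda j: zc[j] / rows[r][j])
        # Step 5: usual pivoting on the element y_rk
        p = rows[r][k]
        rows[r] = [v / p for v in rows[r]]
        rhs[r] = rhs[r] / p
        for i in range(m):
            if i != r:
                f = rows[i][k]
                rows[i] = [vi - f * vr for vi, vr in zip(rows[i], rows[r])]
                rhs[i] = rhs[i] - f * rhs[r]
        basis[r] = k

# Max -2x1 - x2 s.t. 2x1 - x2 - x3 >= 3, x1 - x2 + x3 >= 2, with surplus variables
# x4, x5 as the starting basis (both equations multiplied by -1).
result = dual_simplex(
    rows=[[-2, 1, 1, 1, 0], [-1, 1, -1, 0, 1]],
    rhs=[-3, -2],
    basis=[3, 4],
    c=[-2, -1, 0, 0, 0],
)
print(result)
```

Exact fractions are used so that the pivots can be compared with hand computation; for the data shown, the method stops after two pivots.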


Here we make the following observations


(i) In the dual simplex method we first find the pivot row and then find the pivot
column, whereas in the simplex method we first find the pivot column and then find
the pivot row.


(ii) The pivot element in the dual simplex method is always negative, whereas in the
simplex method it is always positive.


(iii) The maximum ratio criteria in the dual simplex method is used to make sure that
all $(z_j - c_j)$ remain non-negative in the next iteration, whereas the minimum ratio
criteria in the simplex method is used to guarantee that all $x_{B_i}$ remain non-negative
in the next iteration.


(iv) It is possible that in the pivot row there may not be any $y_{rj} < 0$. In that case it can
be shown that the given LPP is infeasible. This situation is very similar to the usual
simplex method where we know that if in the pivot column there is no $y_{ik} > 0$ then
the given LPP has unbounded solution. As 'unboundedness' and 'infeasibility' are
dual concepts, it is natural that the dual simplex method should check infeasibility
in the same way as the simplex method checks unboundedness of the given LPP.


Steps 1, 2, 3 and 5 of the dual simplex method seem to be very natural but Step 4
needs some justification. We are given that for the current dual simplex tableau all
$(z_j - c_j) \ge 0$. Then the basic aim of Step 4 is to guarantee that all $(\hat{z}_j - c_j) \ge 0$
for the next tableau as well. But

$(\hat{z}_j - c_j) = (z_j - c_j) - \frac{y_{rj}}{y_{rk}} (z_k - c_k)$

and therefore $(\hat{z}_j - c_j) \ge 0$ means that for all $j$,

$(z_j - c_j) - \frac{y_{rj}}{y_{rk}} (z_k - c_k) \ge 0.$ (4.31)

Now in (4.31), $(z_j - c_j) \ge 0$, $(z_k - c_k) \ge 0$, $y_{rk} < 0$ and therefore (4.31) trivially holds
for all $j$ for which $y_{rj} \ge 0$. Therefore we need to bother about those $j$ for which
$y_{rj} < 0$. In that case (4.31) gives

$\frac{z_j - c_j}{y_{rj}} \le \frac{z_k - c_k}{y_{rk}},$

i.e.

$\frac{z_k - c_k}{y_{rk}} = \max_j \left\{ \frac{z_j - c_j}{y_{rj}} : y_{rj} < 0 \right\},$

which is nothing but the maximum ratio criteria as given in Step 4.


Example 4.7.1 Use the dual simplex method to solve the following LPP

Max $-2x_1 - x_2$
subject to
$2x_1 - x_2 - x_3 \ge 3$
$x_1 - x_2 + x_3 \ge 2$
$x_1, x_2, x_3 \ge 0.$

Solution We introduce the surplus variables $x_4$ and $x_5$ to get

Max $-2x_1 - x_2 + 0x_3 + 0x_4 + 0x_5$
subject to
$2x_1 - x_2 - x_3 - x_4 = 3$
$x_1 - x_2 + x_3 - x_5 = 2$
$x_1, x_2, x_3, x_4, x_5 \ge 0.$

Now taking the variables $x_4$ and $x_5$ as basic variables we obtain

$B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$, $x_B = B^{-1}b = \begin{pmatrix} -3 \\ -2 \end{pmatrix}$, $c_B = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$,

$(z_1 - c_1) = 0 - (-2) = 2$
$(z_2 - c_2) = 0 - (-1) = 1$
$(z_3 - c_3) = 0 - 0 = 0$
$(z_4 - c_4) = 0 - 0 = 0$
$(z_5 - c_5) = 0 - 0 = 0.$

As for the basic solution $x_B$, all $(z_j - c_j) \ge 0$, we have got the right situation to use the
dual simplex method. The initial dual simplex tableau is

$x_B$           $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$   $y^{(5)}$
$x_4 = -3$       -2       1       1       1       0
$x_5 = -2$       -1       1      -1       0       1
$z_j - c_j$       2       1       0       0       0

First iteration

Step 2 As not all $x_{B_i} \ge 0$, the current solution is not optimal.
Step 3 $x_{B_r} = \min\{x_{B_i} : x_{B_i} < 0\} = \min(-3, -2) = -3$, which corresponds to $x_4$ and
therefore $x_4$ becomes a nonbasic variable.
Step 4 As in the pivot row there is only one $y_{rj} < 0$, we get the first column as the pivot
column. The pivot element is $-2$ and after pivoting we get the next tableau as

$x_B$           $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$   $y^{(5)}$
$x_1 = 3/2$       1     -1/2    -1/2    -1/2      0
$x_5 = -1/2$      0      1/2    -3/2    -1/2      1
$z_j - c_j$       0       2       1       1       0

Second iteration

The current solution is not optimal as $x_5 = -1/2 < 0$. So we identify the second row as
the pivot row and conclude that $x_5$ becomes a nonbasic variable. To identify the pivot
column we use the maximum ratio criteria,

$\max\left( \frac{1}{-3/2}, \frac{1}{-1/2} \right) = -2/3,$

which corresponds to the value $(z_3 - c_3)$ and therefore $x_3$ becomes a basic variable. We
also note that the pivot element is $-3/2$. After pivoting, we get the following tableau

$x_B$           $y^{(1)}$   $y^{(2)}$   $y^{(3)}$   $y^{(4)}$   $y^{(5)}$
$x_1 = 5/3$       1     -2/3      0     -1/3    -1/3
$x_3 = 1/3$       0     -1/3      1      1/3    -2/3
$z_j - c_j$       0      7/3      0      2/3     2/3

As all $x_{B_i} \ge 0$, we obtain the optimal solution as ($x_1 = 5/3$, $x_2 = 0$, $x_3 = 1/3$) and the
optimal value as $(-10/3)$.




























Example 4.7.2 Use the dual simplex method to check that the following LPP is infeasible





Max -x, 
subject to 

$x_1 - x_2 \ge 3$
$-x_1 + x_2 \ge 4$
$x_1, x_2 \ge 0$.

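Solution Taking $x_3$ and $x_4$ as the surplus variables we get the following dual simplex tableau

[Dual simplex tableaus are not recoverable from the scan.]

Since $x_4 = -4$ is the most negative basic variable, its row is the pivot row and the only negative entry in it is in the second column. After this pivot the row of $x_3$ reads $x_3 + x_4 = -7$, i.e. $x_{B_r} < 0$ with no $y_{rj} < 0$. Hence the given LPP is infeasible.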






Remark 4.7.1 In Example 4.7.1, if we take the objective function as $(2x_1 - x_2)$, then $(z_1 - c_1) = -2$ and therefore the dual simplex method is no more applicable. In these situations there does not seem to be any obvious starting solution for the dual simplex method. Though there are some methods in the literature to find a starting solution, to be followed by the dual simplex method, they are not as convenient as the artificial variable technique of the usual simplex method.

When the given LPP is in the minimization form with $c \ge 0$, all constraints are with '$\ge$' sign and $b \ge 0$, the dual simplex method has a major advantage over the simplex method. However the dual simplex method is extremely useful in another way as it becomes a very handy tool for studying post optimality analysis and integer linear programming problems. While post optimality analysis is a topic of discussion for the next section, integer linear programming is studied later in Chapter 6.


4.8 Post Optimality Analysis


The optimal solution and the optimal value of a given LPP generally depend upon the problem parameters, namely c, b and A. Let LP(c, b, A) denote the linear programming problem for a given c, b and A.

In post optimality analysis we are given that the problem LP(c, b, A) has been solved by the simplex method, i.e. $x_B = B^{-1}b$, $(z_j - c_j) \ge 0$ and $z = c_B^T x_B$ are known together with the optimal simplex tableau for the basis matrix B. Later we are told that at the modeling stage, these parameters were not correctly specified; they should have been taken as $\hat{c}$, $\hat{b}$ and $\hat{A}$ respectively. In other words we were supposed to solve the problem LP($\hat{c}, \hat{b}, \hat{A}$) whereas we have actually solved the problem LP(c, b, A). What is the best way out now? Obviously we do not want to solve the problem LP($\hat{c}, \hat{b}, \hat{A}$) from the very beginning and therefore wish to see if we can do something better. Can we really start from the optimal simplex tableau of LP(c, b, A) itself and make appropriate modifications so as to get an optimal solution of LP($\hat{c}, \hat{b}, \hat{A}$)? Post optimality analysis aims to provide answers to these questions in a limited way and the dual simplex method is the basic tool to be employed.

In this section we shall limit ourselves to the following types of changes in the problem parameters

(i) change in the resource vector b
(ii) change in the cost vector c
(iii) addition of a constraint
(iv) addition of a new variable
(v) deletion of a variable
(vi) deletion of a constraint
(vii) change in the coefficient matrix A.

We shall now discuss each of the above cases and explain the working through the below given example only.

Consider the linear programming problem
Max $z = 3x_1 + 4x_2 + x_3 + 7x_4$
subject to
$8x_1 + 3x_2 + 4x_3 + x_4 \le 7$
$2x_1 + 6x_2 + x_3 + 5x_4 \le 3$
$x_1 + 4x_2 + 5x_3 + 2x_4 \le 8$
$x_1, x_2, x_3, x_4 \ge 0$. (4.32)


Let the above problem be solved by the simplex method. The optimal simplex tableau is

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1
z  = 83/19      0    169/38    1/2     0     1/38    53/38     0

Thus the optimal solution of the given LPP is $(x_1 = 16/19, x_2 = 0, x_3 = 0, x_4 = 5/19)$ and the optimal value is $z^* = 83/19$.


(i) Change in the Resource Vector b

We shall first discuss the change in the resource vector b. Let b be changed to a new vector, say, $\hat{b}$. The effect of $\hat{b}$ is on the value of the b.f.s $x_B$, which is now given by $\hat{x}_B = B^{-1}\hat{b}$. Here, if $\hat{x}_B \ge 0$, then the current basis remains optimal with the new value of the objective function as $z(\hat{x}_B)$. But, if some component of $\hat{x}_B$ is less than zero, then the original basis is no longer feasible for the new problem and hence we have to use the dual simplex method to find the new optimal basic feasible solution.

Let us examine the change in the optimal solution if $b = (7, 3, 8)^T$ is changed to $\hat{b} = (13, 3, 8)^T$ in the above example. Let us recall that the optimal simplex tableau gives $B^{-1}$ in the columns $y^{(5)}, y^{(6)}, y^{(7)}$, i.e.

$B^{-1} = \begin{pmatrix} 5/38 & -1/38 & 0 \\ -1/19 & 4/19 & 0 \\ -1/38 & -15/38 & 1 \end{pmatrix}$,

because these are the columns corresponding to the slack variables in the optimal simplex tableau. Now,

$\hat{x}_B = \begin{pmatrix} \hat{x}_1 \\ \hat{x}_4 \\ \hat{x}_7 \end{pmatrix} = B^{-1}\hat{b} = \begin{pmatrix} 5/38 & -1/38 & 0 \\ -1/19 & 4/19 & 0 \\ -1/38 & -15/38 & 1 \end{pmatrix} \begin{pmatrix} 13 \\ 3 \\ 8 \end{pmatrix} = \begin{pmatrix} 31/19 \\ -1/19 \\ 123/19 \end{pmatrix}$.

Here, we observe that $\hat{x}_4$ takes the negative value and hence the new basic solution $\hat{x}_B$ is not feasible. Therefore, we use the dual simplex method to obtain a new optimal b.f.s and get the following tableaus

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 31/19      1     9/38     1/2     0     5/38    -1/38     0
x4 = -1/19      0     21/19     0      1    -1/19     4/19     0
x7 = 123/19     0     59/38    9/2     0    -1/38   -15/38     1
z  = 86/19      0    169/38    1/2     0     1/38    53/38     0

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 3/2        1      3       1/2    5/2     0      1/2       0
x5 = 1          0    -21        0    -19      1      -4        0
x7 = 13/2       0      1       9/2   -1/2     0     -1/2       1
z  = 9/2        0      5       1/2    1/2     0      3/2       0

Thus the optimal solution of the new problem is $(x_1 = 3/2, x_2 = 0, x_3 = 0, x_4 = 0)$ and the optimal value is $z^* = 4.5$.
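The feasibility test for a changed resource vector is a single matrix-vector product. A minimal sketch with exact fractions follows; the names `B_INV`, `BASIS` and `new_basic_solution` are hypothetical and chosen only for this illustration, while the numbers are those of problem (4.32).

```python
from fractions import Fraction as F

# B^{-1} read from the slack columns of the optimal tableau of (4.32).
B_INV = [
    [F(5, 38), F(-1, 38), F(0)],
    [F(-1, 19), F(4, 19), F(0)],
    [F(-1, 38), F(-15, 38), F(1)],
]
BASIS = ["x1", "x4", "x7"]          # order of the basic variables

def new_basic_solution(b_inv, b_new):
    """Return B^{-1} b_new as a list of exact fractions."""
    return [sum(r * b for r, b in zip(row, b_new)) for row in b_inv]

x_hat = new_basic_solution(B_INV, [13, 3, 8])
# Basic variables that became negative: these force the dual simplex method.
negative = [name for name, v in zip(BASIS, x_hat) if v < 0]
print([str(v) for v in x_hat], negative)
```

If the list `negative` is empty the old basis stays optimal and only the objective value has to be recomputed; otherwise the dual simplex method is started from the updated tableau.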


(ii) Change in the Cost Vector c

Let us next discuss the change in the cost vector c. In this situation there may arise two cases.

Case 1: The change is in $c_j$ where j belongs to the set of indices of the non-basic vectors, i.e. the change in the cost vector corresponds to the nonbasic variable only. Such a change will not affect the objective function value, $z = c_B^T x_B$, as this does not affect $c_B$. The only values that may change are $(z_j - c_j)$. Thus the change of $c_j$ to $\hat{c}_j$ will affect only the $z_j - \hat{c}_j$ value, as $z_j - \hat{c}_j = c_B^T y^{(j)} - \hat{c}_j$. If this value is negative, then the simplex method is used to obtain the new optimal solution; otherwise the optimal value and the optimal solution remain unchanged.

Here, the change in c is with respect to the non-basic variable and hence the updated tableau is as follows

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1
z  = 83/19      0      ...     ...     0     1/38    53/38     0

Since $z_j - c_j \ge 0$ for all j, this change in c has not affected the optimal solution and the optimal value.
Case 2: The change is in $c_j$ where j belongs to the set of indices of the basic vectors, i.e. the change in the cost vector corresponds to the basic variables. Let $\hat{c}_B$ be the changed cost vector. In this case the objective function value $z = c_B^T x_B$ and also the relative cost coefficients $(z_j - c_j) = c_B^T y^{(j)} - c_j$ all change. Therefore, we compute the new objective value $z(\hat{c}_B) = \hat{c}_B^T x_B$ and the new relative cost coefficients $\hat{z}_j - c_j = \hat{c}_B^T y^{(j)} - c_j$ for all j belonging to the set of non-basic variables. Now, similar to Case 1, if for some j, $\hat{z}_j - c_j < 0$, then the simplex method will be used to obtain the new optimal solution, otherwise the current b.f.s will remain optimal.
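The repricing rule used in both cases, $z_j - c_j = c_B^T y^{(j)} - c_j$, needs only the columns of the optimal tableau and the cost vector. The sketch below applies it to the tableau of problem (4.32); the function name `reprice` and the convention of listing the costs of $x_1, \ldots, x_7$ (slacks with cost 0) are assumptions made for this illustration. As a sample input it uses the cost vector $(2, 4, 1, 8)^T$.

```python
from fractions import Fraction as F

# Rows of the optimal tableau of (4.32); row i belongs to basic variable BASIS[i].
Y = [
    [F(1), F(9, 38), F(1, 2), F(0), F(5, 38), F(-1, 38), F(0)],
    [F(0), F(21, 19), F(0), F(1), F(-1, 19), F(4, 19), F(0)],
    [F(0), F(59, 38), F(9, 2), F(0), F(-1, 38), F(-15, 38), F(1)],
]
X_B = [F(16, 19), F(5, 19), F(126, 19)]
BASIS = [1, 4, 7]                    # x1, x4, x7 (1-based indices)

def reprice(c):
    """c: costs of x1..x7. Returns (z, [z_j - c_j for j = 1..7])."""
    c_b = [c[j - 1] for j in BASIS]
    z = sum(cb * xb for cb, xb in zip(c_b, X_B))
    reduced = [
        sum(c_b[i] * Y[i][j] for i in range(len(BASIS))) - c[j]
        for j in range(len(c))
    ]
    return z, reduced

z_new, reduced_new = reprice([2, 4, 1, 8, 0, 0, 0])
entering = [j + 1 for j, v in enumerate(reduced_new) if v < 0]
print(str(z_new), [str(v) for v in reduced_new], entering)
```

Any index reported in `entering` marks a column with negative $z_j - c_j$, so the simplex method has to continue from the current tableau; an empty list means the current b.f.s stays optimal.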

Let us study the effect of change in c from $(3, 4, 1, 7)^T$ to $(2, 4, 1, 8)^T$ in the above example.

Here, the change in c is with respect to the basic variables and hence we compute the new value of $z(\hat{c}_B)$ and $(\hat{z}_j - c_j)$ to get the updated tableau as follows

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1
z  = 72/19      0    101/19     0      0    -6/38    62/38     0

Note that in this case we compute $\hat{z}_j - c_j$ for all non-basic variables only because for basic variables it has to be zero. Now, in the updated tableau $\hat{z}_5 - c_5 < 0$ and hence we use the simplex method to obtain the new optimal solution.

(iii) Addition of a Constraint



















































Here there are two cases to be discussed because the current b.f.s may or may not satisfy the given additional constraint.

Case 1 If the new constraint is satisfied by the current optimal solution, then this solution remains optimal for the new problem as well. Let us discuss the effect of adding the constraint
$2x_1 + 3x_2 + x_3 + 5x_4 \le 4$
with respect to problem (4.32). Because in the current optimal solution $2 \times 16/19 + 5 \times 5/19 = 57/19 = 3 < 4$, the optimal solution remains unchanged.


Case 2 If the additional constraint is violated by the current optimal solution $x_B$, then the problem has to be solved further by taking the new constraint into consideration. Each new constraint in the form of an inequality requires the introduction of a slack/surplus variable and if necessary an artificial variable also. For the constraints in the form of equations, artificial variables are introduced. The problem is then solved by the dual simplex method. Here it must be noted that the new additional constraint can not be taken directly in the optimal simplex tableau of the given problem; it has to be first expressed in the canonical form.

Let us study the effect on the optimal solution after introducing the additional constraint $2x_1 + 3x_2 + x_3 + 5x_4 \ge 4$.
Here, the current solution violates the new constraint. Therefore in order to obtain the new solution, we first express the given constraint in the canonical form and for that we write it as

$2x_1 + 3x_2 + x_3 + 5x_4 - x_8 = 4$, (4.33)

where $x_8 \ge 0$ is the surplus variable.
Now the above equation has to be expressed in the canonical form, i.e. it should have only one basic variable with coefficient as 'one' and all other variables in the equation must be nonbasic variables. For our example, we need to make $x_8$ a basic variable and express the other basic variables, namely, $x_1$ and $x_4$, in terms of the nonbasic variables. But from the first two rows of the current optimal simplex tableau we have

$x_1 = 16/19 - (9/38\,x_2 + 1/2\,x_3 + 5/38\,x_5 - 1/38\,x_6)$ (4.34)

$x_4 = 5/19 - 21/19\,x_2 + 1/19\,x_5 - 4/19\,x_6$. (4.35)

Substituting $x_1$ and $x_4$ from (4.34) and (4.35) in (4.33) we get

$3x_2 + x_6 + x_8 = -1$. (4.36)

Appending (4.36) to the optimal simplex tableau gives

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)   y(8)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0      0
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0      0
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1      0
x8 = -1         0      3        0      0      0        1       0      1
z  = 83/19      0    169/38    1/2     0     1/38    53/38     0      0

Now the dual simplex method is employed on this tableau to treat the new problem. Here the pivot row (the row of $x_8$) has a negative right hand side but no negative $y_{rj}$, so the method stops with the conclusion that the new problem is infeasible. This agrees with the observation that $2x_1 + 3x_2 + x_3 + 5x_4 \le 2x_1 + 6x_2 + x_3 + 5x_4 \le 3 < 4$ for every feasible solution of (4.32).
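Bringing an added constraint to canonical form amounts to eliminating the basic variables from it with the rows of the optimal tableau. The sketch below does this for constraint (4.33) after multiplying it by $-1$, so that the surplus variable $x_8$ gets coefficient $+1$. The function name `canonical_row` is hypothetical, and only the coefficients of $x_1, \ldots, x_7$ are tracked.

```python
from fractions import Fraction as F

# Constraint rows of the optimal tableau of (4.32) and their right-hand sides.
ROWS = [
    [F(1), F(9, 38), F(1, 2), F(0), F(5, 38), F(-1, 38), F(0)],
    [F(0), F(21, 19), F(0), F(1), F(-1, 19), F(4, 19), F(0)],
    [F(0), F(59, 38), F(9, 2), F(0), F(-1, 38), F(-15, 38), F(1)],
]
RHS = [F(16, 19), F(5, 19), F(126, 19)]
BASIS = [1, 4, 7]                    # x1, x4, x7 (1-based indices)

def canonical_row(coeffs, rhs):
    """Eliminate the basic variables from a new constraint row.

    coeffs: coefficients of x1..x7 in the new row; rhs: its right-hand side.
    Returns the row expressed in the nonbasic variables and its new rhs.
    """
    row, r = [F(c) for c in coeffs], F(rhs)
    for i, j in enumerate(BASIS):
        f = row[j - 1]
        if f != 0:
            row = [a - f * y for a, y in zip(row, ROWS[i])]
            r -= f * RHS[i]
    return row, r

# (4.33) times -1:  -2x1 - 3x2 - x3 - 5x4 + x8 = -4
row, r = canonical_row([-2, -3, -1, -5, 0, 0, 0], -4)
print([str(v) for v in row], str(r))
```

A negative right-hand side in the result signals that the current solution violates the new constraint, and the signs of the entries in the resulting row decide whether a dual simplex pivot is possible.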
(iv) Addition of a New Variable

Since addition of a new variable is dual to the addition of a new constraint, this case can be handled by the usual simplex method.

Let the new variable to be added be $x_{n+1}$. Also let $a_{i,n+1}$ $(i = 1, 2, \ldots, m)$ and $c_{n+1}$ be its coefficients in the constraints and the objective function, respectively. Since the number of constraints remains the same, the original optimal solution remains basic feasible for the new problem as well. Therefore we only calculate $z_{n+1} - c_{n+1}$. Obviously the evaluation of $z_{n+1} - c_{n+1}$ will need the evaluation of the vector $y^{(n+1)}$. If $z_{n+1} - c_{n+1}$ is non negative, then the current optimal solution remains optimal for the new problem. If not, then we use the simplex method to obtain a new optimal solution.

We study the effect of introducing a new variable $x_8$ with the cost coefficient 4 and the activity vector $(3, -2, 4)^T$ with respect to problem (4.32). Since we need to evaluate $(z_8 - c_8)$, we first find

$y^{(8)} = B^{-1}\begin{pmatrix} 3 \\ -2 \\ 4 \end{pmatrix} = \begin{pmatrix} 17/38 \\ -11/19 \\ 179/38 \end{pmatrix}$

and this gives $z_8 - c_8 = c_B^T y^{(8)} - c_8 = -255/38$. Therefore the updated tableau is

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)   y(8)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0     17/38
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0    -11/19
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1    179/38
z  = 83/19      0    169/38    1/2     0     1/38    53/38     0   -255/38

As $z_8 - c_8 < 0$, the simplex method is now used to obtain the new optimal solution.
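Pricing out a new column uses the same two products as above, $y^{(n+1)} = B^{-1}a^{(n+1)}$ and $z_{n+1} - c_{n+1} = c_B^T y^{(n+1)} - c_{n+1}$. A minimal sketch for the data of problem (4.32) follows; the names `price_new_column`, `B_INV` and `C_B` are assumptions made for this illustration.

```python
from fractions import Fraction as F

B_INV = [
    [F(5, 38), F(-1, 38), F(0)],
    [F(-1, 19), F(4, 19), F(0)],
    [F(-1, 38), F(-15, 38), F(1)],
]
C_B = [F(3), F(7), F(0)]             # costs of the basic variables x1, x4, x7

def price_new_column(a_new, c_new):
    """Return (y, z - c) for a new activity vector a_new with cost c_new."""
    y = [sum(r * a for r, a in zip(row, a_new)) for row in B_INV]
    reduced_cost = sum(cb * yi for cb, yi in zip(C_B, y)) - c_new
    return y, reduced_cost

y8, rc8 = price_new_column([3, -2, 4], 4)
print([str(v) for v in y8], str(rc8))
```

A non-negative reduced cost means the new variable can stay at zero; a negative one makes the new column the entering column of the next simplex iteration.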


(v) Deletion of a Variable

Here we need to consider two cases.

Case 1 If the deleted variable is a non-basic variable or with zero value in the optimal basis, then the original solution remains optimal because the zero value of the variable in the optimal solution makes the variable nonexistent in effect.

Case 2 If the variable to be deleted is present in the basis with positive value, then its removal will affect the solution. To obtain the new solution we first make this basic variable leave the basis forcefully, i.e. we replace this basic variable with some non-basic variable and make it a non-basic variable. Once it becomes a non-basic variable, it can be removed from the tableau as it has zero value and its removal does not change the optimal value.

We consider problem (4.32) and study the effect of deletion of variable $x_4$, and hence find the new b.f.s if it exists. As $x_4$ is a basic variable, its removal will affect the optimal solution. Therefore as discussed, we first make it nonbasic and then delete it from the tableau. Hence we replace the variable $x_4$ with $x_2$ to get the following tableau

x_B            y(1)   y(2)   y(3)    y(4)     y(5)    y(6)     y(7)
x1 = 11/14      1      0     1/2    -3/14     1/7    -1/14      0
x2 = 5/21       0      1      0     19/21    -1/21    4/21      0
x7 = 263/42     0      0     9/2   -59/42     1/21  -29/42      1
z  = 139/42     0      0     1/2  -169/42     5/21   23/42      0

Hence the new b.f.s is $(x_1 = 11/14, x_2 = 5/21, x_3 = 0, x_4 = 0)$. Once the column $y^{(4)}$ is deleted, all the remaining $z_j - c_j$ are non negative, so this b.f.s is optimal for the reduced problem.


(vi) Deletion of a Constraint

If the constraint to be deleted is such that its slack variable has a positive value in the optimal solution then its deletion leaves the optimal solution unchanged. This is because the constraint is not being satisfied as an equality by the optimal solution and it is ineffective in determining the optimal solution. Therefore the ineffective constraint may be deleted from the problem without changing the optimal value. These constraints are called inactive constraints.

If the constraint to be deleted is such that its slack variable has zero value in the optimal solution, i.e. the constraint is an active constraint, then its deletion may change the optimal solution. In this situation we first make the active constraint inactive by bringing its slack variable in the basis and then delete the constraint. We illustrate the above procedure with the help of problem (4.32).

Let us study the effect of removal of the constraint $8x_1 + 3x_2 + 4x_3 + x_4 \le 7$. From the optimal simplex tableau, we observe that the variable $x_5$ is the slack variable corresponding to the above constraint and it is not in the basis. Thus the given constraint is an active constraint. So we make it inactive and then drop it and for that we insert $x_5$ in the basis replacing the basic variable $x_1$. This gives the following updated tableaus

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x1 = 16/19      1     9/38     1/2     0     5/38    -1/38     0
x4 = 5/19       0     21/19     0      1    -1/19     4/19     0
x7 = 126/19     0     59/38    9/2     0    -1/38   -15/38     1
z  = 83/19      0    169/38    1/2     0     1/38    53/38     0

x_B            y(1)   y(2)     y(3)   y(4)   y(5)    y(6)     y(7)
x5 = 32/5      38/5    9/5     19/5    0      1      -1/5      0
x4 = 3/5        2/5    6/5      1/5    1      0       1/5      0
x7 = 34/5       1/5    8/5     23/5    0      0      -2/5      1
z  = 21/5      -1/5   22/5      2/5    0      0       7/5      0

The row of $x_5$ and the column $y^{(5)}$ can now be dropped from the tableau.




(vii) Change in the coefficient matrix A 


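Here again two cases can arise.

Case 1 If the vector $a^{(j)}$ is changed to $\hat{a}^{(j)}$, where $x_j$ is a non-basic variable in the original optimal solution, then the modified $(z_j - c_j)$ can be computed. If $z_j - c_j \ge 0$ for all j, then the optimal solution remains unchanged. If $z_j - c_j < 0$, then further iterations of the simplex method are needed.

Let us change the values of $a_{21}, a_{22}, a_{23}$ from $(3, 6, 4)^T$ to $(3, 6, 1)^T$. Here $x_2$ is a non-basic variable. Therefore the new value is

$z_2 - c_2 = c_B^T B^{-1}\hat{a}^{(2)} - c_2 = (3, 7, 0)\begin{pmatrix} 5/38 & -1/38 & 0 \\ -1/19 & 4/19 & 0 \\ -1/38 & -15/38 & 1 \end{pmatrix}\begin{pmatrix} 3 \\ 6 \\ 1 \end{pmatrix} - 4 = \frac{169}{38} \approx 4.447$.

Since the value of $z_2 - c_2$ is non negative, the current optimal solution remains optimal. Had this value been less than zero, the simplex method would be used to find the new optimal solution.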

Next let us study the change in the coefficient matrix corresponding to a basic variable. Let us change the values of $a_{11}, a_{12}, a_{13}$ from $(8, 2, 1)^T$ to $(8, 3, 1)^T$. Since the change is in the coefficients of $x_1$ which is a basic variable in the optimal solution, to study the effect we introduce a new variable say $\hat{x}_1$ with coefficients $(8, 3, 1)^T$ and the coefficient in the objective function as $\hat{c}_1 = 3$. Here we assign $-M$ cost to $x_1$ and perform the Big-M method by treating $x_1$ as an artificial variable, and introducing $\hat{x}_1$ as a new variable with column $y^{(8)} = B^{-1}(8, 3, 1)^T$.

x_B            y(1)   y(2)          y(3)       y(4)   y(5)           y(6)         y(7)   y(8)
x1 = 16/19      1     9/38          1/2         0     5/38           -1/38         0     37/38
x4 = 5/19       0     21/19          0          1    -1/19            4/19         0      4/19
x7 = 126/19     0     59/38         9/2         0    -1/38          -15/38         1    -15/38
z = (35-16M)/19 0   (142-9M)/38   -(M+2)/2      0   (-5M-14)/38    (56+M)/38       0   (-37M-58)/38

Since the value of $z_8 - c_8$ is the most negative, the simplex method can be used to find the new optimal solution.


4.9 Two Person Zero-Sum Matrix Games 


In this section, we present certain basic definitions and preliminaries with regard to two person zero-sum matrix games.

Let $R^n$ denote the n-dimensional Euclidean space and $R^n_+$ be its non-negative orthant. Let $A \in R^{m \times n}$ be an $(m \times n)$ real matrix and $e^T = (1, 1, \ldots, 1)$ be a vector of 'ones' whose dimension is specified as per the specific context. By a two person zero-sum matrix game G we mean the triplet $G = (S^m, S^n, A)$ where $S^m = \{x \in R^m_+ : e^T x = 1\}$ and $S^n = \{y \in R^n_+ : e^T y = 1\}$. In the terminology of the matrix game theory, $S^m$ (respectively $S^n$) is called the strategy space for Player I (respectively Player II) and A is called the pay-off matrix. The elements of $S^m$ (respectively $S^n$) which are of the form $(0, 0, \ldots, 1, \ldots, 0)^T$, where 1 is at the $i^{th}$ (respectively $j^{th}$) place, are called pure strategies for Player I (respectively Player II). If Player I chooses the $i^{th}$ pure strategy and Player II chooses the $j^{th}$ pure strategy then $a_{ij}$ is the amount paid by Player II to Player I. If the game is zero-sum then $-a_{ij}$ is the amount paid by Player I to Player II, i.e. the gain of one player is the loss of the other player. The quantity $E(x, y) = x^T A y$ is called the expected pay-off of Player I by Player II, as elements of $S^m$ (respectively $S^n$) can be thought of as a set of all probability distributions over $I = \{1, 2, \ldots, m\}$ (respectively $J = \{1, 2, \ldots, n\}$). It is customary to assume that Player I is a maximizing player and Player II is a minimizing player. The triplet $PG = (I, J, A)$ is called the pure form of the game G whenever G is being referred to as the mixed extension of the pure game PG. We shall refer to a two person zero-sum matrix game always as $G = (S^m, S^n, A)$ and if the game is in the pure form it will be clear from the context itself. Thus, for us $S^m$ refers to the (mixed) strategy space of Player I, $S^n$ refers to the (mixed) strategy space of Player II, and A refers to the pay-off matrix which introduces the function $E : S^m \times S^n \to R$ given by $E(x, y) = x^T A y$, called the expected pay-off function.

The meaning of the solution of the game G = (S”, S”, A) is best understood in terms 
of maxmin and minmax principles for Player I and Player II respectively. According to 
this principle, each player adopts that strategy which results in the best of the worst 
outcomes. In other words, Player I (the maximizing player) decides to play that strategy 
which corresponds to the maximum of the minimum gain for his different courses of 
action. This is known as the maxmin principle. 

Similarly, Player II (the minimizing player) also likes to play safe and in that case he selects that strategy which corresponds to the minimum of the maximum losses for his different courses of action. This is known as the minmax principle.


Employing the maxmin principle for Player I, we obtain $\underline{v} = \max_{x \in S^m} \min_{y \in S^n} (x^T A y)$, called the lower value of the game. Similarly the minmax principle for Player II gives $\bar{v} = \min_{y \in S^n} \max_{x \in S^m} (x^T A y)$, called the upper value of the game. It is well known that $\underline{v} \le \bar{v}$. The main result of two-person zero-sum matrix game theory asserts that, in fact, these are equal, i.e. $\underline{v} = v^* = \bar{v}$, which is then called the value of the game. The following theorem is very useful in this regard.

Theorem 4.9.1 If there exists $(x^*, y^*, v^*) \in S^m \times S^n \times R$ such that

(i) $E(x^*, y) \ge v^*$, $\forall\, y \in S^n$, and
(ii) $E(x, y^*) \le v^*$, $\forall\, x \in S^m$,

then $\underline{v} = v^* = \bar{v}$ and conversely.

Definition 4.9.1 (Saddle point) A point $(x^*, y^*) \in S^m \times S^n$ is called a saddle point of the function E if $E(x, y^*) \le E(x^*, y^*) \le E(x^*, y)$ for all $x \in S^m$ and $y \in S^n$.

The following is a corollary of Theorem 4.9.1.

Corollary 4.9.1 A necessary and sufficient condition that $\underline{v} = \bar{v}$, i.e. $\min_{y \in S^n} \max_{x \in S^m} (x^T A y) = \max_{x \in S^m} \min_{y \in S^n} (x^T A y)$, is that the function $E(x, y)$ has a saddle point $(x^*, y^*)$. In that case $\underline{v} = (x^*)^T A y^* = \bar{v}$.

Theorem 4.9.1 leads to the following definition of the solution of the game G.

Definition 4.9.2 (Solution of a game). Let $G = (S^m, S^n, A)$ be the given game. A triplet $(x^*, y^*, v^*) \in S^m \times S^n \times R$ is called a solution of the game G if

$E(x^*, y) \ge v^*$, $\forall\, y \in S^n$
and
$E(x, y^*) \le v^*$, $\forall\, x \in S^m$.

Here $x^*$ is called an optimal strategy for Player I, $y^*$ is called an optimal strategy for Player II, and $v^*$ is called the value of the game G.

Remark 4.9.1. In view of Theorem 4.9.1 and its Corollary 4.9.1, $(x^*, y^*, v^*)$ is a solution of the game G if and only if $(x^*, y^*)$ is a saddle point of E and in that case $v^* = E(x^*, y^*)$. Such a saddle point is guaranteed to exist if $\underline{v} = \bar{v}$ and conversely. Here it may be noted that only the existence of $(\bar{x}, \bar{y}) \in S^m \times S^n$ such that $\min_{y \in S^n} \max_{x \in S^m} (x^T A y) = \max_{x \in S^m} \min_{y \in S^n} (x^T A y) = \bar{x}^T A \bar{y}$ is not a sufficient condition in order that $(\bar{x}, \bar{y})$ be a solution of the matrix game G, i.e. this may not imply that $(\bar{x}, \bar{y})$ constitutes an optimal pair of strategies. For example, if $G = (S^2, S^2, A)$ with $A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ then $\underline{v} = \bar{v} = 1$. Also $x^* = (1/2, 1/2)^T = y^*$ constitutes a saddle point of E and therefore a pair of optimal strategies. However $\bar{x} = (1/2, 1/2)^T$, $\bar{y} = (1, 0)^T$ also gives $E(\bar{x}, \bar{y}) = 1$, but $\bar{y}$ is obviously not optimal to Player II. The main reason being that $(x^*, y^*)$ is a saddle point of $E(x, y)$ but $(\bar{x}, \bar{y})$ is not.
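The distinction made in the remark can be checked numerically. Because $E(x, y)$ is linear in each argument, the two conditions of Definition 4.9.2 need only be tested against pure strategies. The sketch below does this for the $2 \times 2$ example; the function names `expected_payoff` and `is_solution` are hypothetical and used only for this illustration.

```python
from fractions import Fraction as F

def expected_payoff(A, x, y):
    """E(x, y) = x^T A y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

def is_solution(A, x, y, v):
    """Check E(x, e_j) >= v for all j and E(e_i, y) <= v for all i."""
    m, n = len(A), len(A[0])
    against_columns = [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]
    against_rows = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    return all(p >= v for p in against_columns) and all(p <= v for p in against_rows)

A = [[2, 0], [0, 2]]
half = [F(1, 2), F(1, 2)]
print(is_solution(A, half, half, 1))          # the saddle point (x*, y*)
print(expected_payoff(A, half, [1, 0]))       # same pay-off at (x-bar, y-bar)
print(is_solution(A, half, [1, 0], 1))        # but y-bar is not optimal
```

The second pair attains the value 1 but fails the test, because Player I can obtain 2 against $\bar{y} = (1, 0)^T$ by playing the first row.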


Next we answer the basic question regarding the existence of a solution for the game 
G. The following theorem is very fundamental in this context as it asserts that every 
two-person zero-sum matrix game G always has a solution. 


Theorem 4.9.2 (Fundamental theorem of matrix games). Let $G = (S^m, S^n, A)$. Then $\min_{y \in S^n} \max_{x \in S^m} (x^T A y)$ and $\max_{x \in S^m} \min_{y \in S^n} (x^T A y)$ both exist and are equal.

Here the problem $\min_{y \in S^n} \max_{x \in S^m} (x^T A y)$ (respectively $\max_{x \in S^m} \min_{y \in S^n} (x^T A y)$) is called Player II's (respectively Player I's) problem. If there exists $(i_0, j_0) \in I \times J$ such that $a_{i_0 j} \ge a_{i_0 j_0} \ge a_{i j_0}$ for all $i \in I$ and $j \in J$, then $(i_0, j_0)$ is called a pure saddle point and in that case the game has a solution in the pure form. In this situation it may be noted that $a_{i_0 j_0}$ is the smallest element in the $i_0^{th}$ row and the largest element in the $j_0^{th}$ column.

Thus Theorem 4.9.2 above guarantees that every two person zero-sum matrix game G has a solution. If there is no solution in the pure form then there is certainly a solution in the mixed form. Therefore, the question 'How to obtain the solution for this matrix game G?', is to be addressed in the next section.
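The pure saddle point condition stated above (smallest element of its row and largest element of its column) is easy to test by direct search. The following sketch is only an illustration: the function name `pure_saddle_points` and the $2 \times 3$ matrix used in the example call are hypothetical and do not come from the text.

```python
def pure_saddle_points(A):
    """Return all 1-based (i0, j0) with a[i0][j0] the row minimum and column maximum."""
    points = []
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            column = [r[j] for r in A]
            if a == min(row) and a == max(column):
                points.append((i + 1, j + 1))
    return points

print(pure_saddle_points([[3, 1, 4], [2, 0, 1]]))   # hypothetical pay-off matrix
print(pure_saddle_points([[2, 0], [0, 2]]))         # no pure saddle point
```

An empty result means the game has no solution in the pure form, so a solution in mixed strategies has to be computed.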


4.10 Linear Programming and Matrix Game Equivalence 


We shall now establish an equivalence between the two person zero-sum matrix game $G = (S^m, S^n, A)$ and a pair of primal-dual linear programming problems. This equivalence, besides being interesting mathematically, is also very useful as it provides a very efficient way to solve the given game G.

Let us consider the Player I's (respectively Player II's) problem: $\max_{x \in S^m} \min_{y \in S^n} (x^T A y)$ (respectively $\min_{y \in S^n} \max_{x \in S^m} (x^T A y)$). Since $S^m$ and $S^n$ are compact convex sets and for a given x (respectively given y), the function $E(x, y)$ is a linear function of y (respectively x), the $\min_{y \in S^n} x^T A y$ (respectively $\max_{x \in S^m} x^T A y$) will be attained at an extreme point of $S^n$ (respectively $S^m$). Therefore for a given $x \in S^m$,

$\min_{y \in S^n} (x^T A y) = \min_{1 \le j \le n} (x^T A e_j)$,

where $e_j = (0, 0, \ldots, 1, \ldots, 0)^T$ with '1' at the $j^{th}$ place, is the $j^{th}$ pure strategy of Player II. Thus

$\max_{x \in S^m} \min_{y \in S^n} (x^T A y) = \max_{x \in S^m} \min_{1 \le j \le n} \left( \sum_{i=1}^{m} a_{ij} x_i \right)$.

If we now take $v = \min_{1 \le j \le n} \left( \sum_{i=1}^{m} a_{ij} x_i \right)$ then the maxmin value for Player I is obtained by solving the following linear programming problem

Max v
subject to
$\sum_{i=1}^{m} a_{ij} x_i \ge v$, $(j = 1, 2, \ldots, n)$
$e^T x = 1$
$x \ge 0$. (4.37)

Similarly, the minmax value for Player II is obtained as a solution of the following linear programming problem

Min w
subject to
$\sum_{j=1}^{n} a_{ij} y_j \le w$, $(i = 1, 2, \ldots, m)$
$e^T y = 1$
$y \ge 0$, (4.38)

where $w = \max_{1 \le i \le m} \left( \sum_{j=1}^{n} a_{ij} y_j \right)$.

Now it can be verified that (4.37) and (4.38) constitute a primal-dual pair of linear programming problems. Since both maxmin and minmax are attained, these two LPPs have optimal solutions ($\bar{x}$ and $\bar{y}$) and therefore by the linear programming duality, the optimal values of (4.37) and (4.38) will be equal. Let this common value be $\bar{v}$. Then the way (4.37) and (4.38) have been constructed, it is obvious that $\sum_{i=1}^{m} a_{ij}\bar{x}_i \ge \bar{v}$, $(j = 1, 2, \ldots, n)$ and $\sum_{j=1}^{n} a_{ij}\bar{y}_j \le \bar{v}$, $(i = 1, 2, \ldots, m)$, implying that $\bar{x}^T A y \ge \bar{v}$ for all $y \in S^n$ and $x^T A \bar{y} \le \bar{v}$ for all $x \in S^m$.


The above discussion then leads to the following equivalence theorem. 


Theorem 4.10.1 The triplet $(\bar{x}, \bar{y}, \bar{v}) \in S^m \times S^n \times R$ is a solution of the game G if and only if $\bar{x}$ is optimal to (4.37), $\bar{y}$ is optimal to (4.38) and $\bar{v}$ is the common value of (4.37) and its dual (4.38).


Thus, we have concluded that the matrix game $G = (S^m, S^n, A)$ is equivalent to the primal-dual linear programming problems (4.37)-(4.38).

The pair (4.37)-(4.38) can further be expressed in the form (LP2)-(LD2) where duality is much more obvious and it does not need any checking. For this we need to assume that $v^*$, the value of the game G, is positive. This assumption can be taken without any loss of generality since matrix games $G = (S^m, S^n, A)$ and $G_1 = (S^m, S^n, A_1)$, $A_1 = (a_{ij} + \alpha)$, $\alpha \in R$, will have the same optimal strategies but different values $v^*$ and $v_1^*$, where $v_1^* = v^* + \alpha$. The consequence of the assumption that $v^* > 0$ is that in (4.37) and (4.38) we have $v > 0$ and $w > 0$. Now by defining $x_i' = x_i / v$ $(i = 1, 2, \ldots, m)$ and $y_j' = y_j / w$ $(j = 1, 2, \ldots, n)$ we get $e^T x' = 1/v$ and $e^T y' = 1/w$, so that Max v (respectively Min w) is the same as Min $e^T x'$ (respectively Max $e^T y'$). Therefore systems (4.37) and (4.38) become

Min $e^T x'$
subject to
$A^T x' \ge e$
$x' \ge 0$, (4.39)

and

Max $e^T y'$
subject to
$A y' \le e$
$y' \ge 0$. (4.40)
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" xeR", 0€R* m A= : 
where $c \in R^n$, $x \in R^n$, $b \in R^m$, $y \in R^m$, and $A = (a_{ij})$ is an $(m \times n)$ real matrix.
Now, consider the matrix game associated with the following $(n + m + 1) \times (n + m + 1)$ skew-symmetric matrix

$B = \begin{pmatrix} 0 & -A^T & c \\ A & 0 & -b \\ -c^T & b^T & 0 \end{pmatrix}$.

Since B is a skew-symmetric matrix, the value of the matrix game associated with B is zero and both players have the same optimal strategies. In the following, the matrix game B will mean the matrix game associated with B and indices i and j will run from 1 to n and 1 to m respectively. Also a strategy for either player will be denoted by $(x, y, z)$ where $x \in R^n$, $y \in R^m$ and $z \in R$.

The following result shows that the primal-dual pair (4.41)-(4.42) is equivalent to the matrix game B.

Theorem 4.10.2 Let $\bar{x}$ and $\bar{y}$ be optimal to (4.41) and (4.42) respectively. Let $z^* = \dfrac{1}{1 + \sum_i \bar{x}_i + \sum_j \bar{y}_j}$, $x^* = z^*\bar{x}$, $y^* = z^*\bar{y}$. Then $(x^*, y^*, z^*)$ solves the matrix game B.

Proof. First we show that $Z^* = (x^*, y^*, z^*)$ will be an optimal strategy for both the players. For this we note that

$\sum_i x_i^* + \sum_j y_j^* + z^* = \left(1 + \sum_i \bar{x}_i + \sum_j \bar{y}_j\right) z^* = 1$

and therefore $(x^*, y^*, z^*) \in S^{m+n+1}$. Now to prove that $Z^* = (x^*, y^*, z^*)$ is an optimal strategy for Player II, we have to show that $BZ^* \le 0$.
But $\bar{x}$ and $\bar{y}$ are solutions of (4.41) and (4.42) and therefore by the duality theory

$c - A^T\bar{y} \le 0$
$A\bar{x} - b \le 0$
$-c^T\bar{x} + b^T\bar{y} \le 0$.

On multiplying these inequalities by $z^*$, we have

$cz^* - A^T y^* \le 0$
$Ax^* - bz^* \le 0$
$-c^T x^* + b^T y^* \le 0$,

which on writing in matrix form gives $BZ^* \le 0$.
We also note that B is skew symmetric and therefore $BZ^* \le 0$ gives $(Z^*)^T B \ge 0$, which implies that $Z^*$ is an optimal strategy for Player I as well.


The converse also holds. Let $(x^*, y^*, z^*)$ be an optimal strategy of the matrix game B with $z^* > 0$, and let $\bar{x}_i = x_i^*/z^*$, $\bar{y}_j = y_j^*/z^*$. Then $\bar{x}$ and $\bar{y}$ are optimal to (4.41) and (4.42) respectively.

Proof. Since both players have the same optimal strategies, it is sufficient to take $(x^*, y^*, z^*)$ as an optimal strategy for either player, say Player II. Similar arguments are valid if $Z^*$ is taken as an optimal strategy for Player I. Therefore, let $Z^* = (x^*, y^*, z^*)$ be an optimal strategy for Player II with $z^* > 0$. Then we have

$-A^T y^* + cz^* \le 0$
$Ax^* - bz^* \le 0$
$-c^T x^* + b^T y^* \le 0$.

Now $-A^T y^* + cz^* \le 0$ gives $A^T\left(\frac{y^*}{z^*}\right) \ge c$. Similarly the other two inequalities give $A\left(\frac{x^*}{z^*}\right) \le b$ and $c^T\left(\frac{x^*}{z^*}\right) \ge b^T\left(\frac{y^*}{z^*}\right)$. Therefore we have

$A^T\bar{y} \ge c$,
$A\bar{x} \le b$,
and
$c^T\bar{x} \ge b^T\bar{y}$.

But the first two inequalities imply

$c^T\bar{x} \le \bar{y}^T A\bar{x} \le \bar{y}^T b = b^T\bar{y}$,

and therefore we have $c^T\bar{x} = b^T\bar{y}$.

This proves that $\bar{x}$ and $\bar{y}$ are optimal for the primal and dual problems respectively.

Thus the equivalence between two person zero-sum matrix game theory and duality in linear programming is complete in the sense that given any general two person zero-sum matrix game G, there is a related pair of primal-dual linear programming problems, and given any general pair of primal-dual linear programming problems, there is an associated matrix game B.

Example 4.10.1 Solve the following $(3 \times 3)$ game where pay-off matrix is

$A = \begin{pmatrix} 3 & -1 & -3 \\ -3 & 3 & -1 \\ -4 & -3 & 3 \end{pmatrix}$.

Solution To get the solution of the given game, following the discussion of Section 4.10, the LPP for Player I is

Max v
subject to
$3x_1 - 3x_2 - 4x_3 \ge v$
$-x_1 + 3x_2 - 3x_3 \ge v$
$-3x_1 - x_2 + 3x_3 \ge v$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$, (4.43)

while the LPP for Player II is

Min w
subject to
$3y_1 - y_2 - 3y_3 \le w$
$-3y_1 + 3y_2 - y_3 \le w$
$-4y_1 - 3y_2 + 3y_3 \le w$
$y_1 + y_2 + y_3 = 1$
$y_1, y_2, y_3 \ge 0$. (4.44)

Solving the above LPP's by the simplex method we get
$x_1^* = 20/45$, $x_2^* = 11/45$, $x_3^* = 14/45$
$y_1^* = 14/45$, $y_2^* = 11/45$, $y_3^* = 20/45$
$v^* = w^* = -29/45$.

Therefore the value of the given game is $-29/45$. Also, $x^* = (20/45, 11/45, 14/45)$ and $y^* = (14/45, 11/45, 20/45)$ are optimal strategies for Player I and Player II respectively.
We can also solve the given game by solving the LPP's (4.39) and (4.40) but for that we need to assume that the value of the game is positive. Since the pay-off matrix does not guarantee that the value of the given game is positive, we need to add a suitable positive number to all elements of the pay-off matrix and then solve the LPP's (4.39) and (4.40) corresponding to this new pay-off matrix. For our example we add 5 to all elements of the pay-off matrix to get the new pay-off matrix

$\begin{pmatrix} 8 & 4 & 2 \\ 2 & 8 & 4 \\ 1 & 2 & 8 \end{pmatrix}$.
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Now the LPP’s (4.39) and (4.40) corresponding to above pay-off matrix are 


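Min $x_1 + x_2 + x_3$
subject to
$8x_1 + 2x_2 + x_3 \ge 1$
$4x_1 + 8x_2 + 2x_3 \ge 1$
$2x_1 + 4x_2 + 8x_3 \ge 1$
$x_1, x_2, x_3 \ge 0$, (4.45)
and
Max $y_1 + y_2 + y_3$
subject to
$8y_1 + 4y_2 + 2y_3 \le 1$
$2y_1 + 8y_2 + 4y_3 \le 1$
$y_1 + 2y_2 + 8y_3 \le 1$
$y_1, y_2, y_3 \ge 0$, (4.46)
respectively.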

On solving the above two LPP's by the simplex method we get

$x_1 = 5/49$, $x_2 = 11/196$, $x_3 = 1/14$
$y_1 = 1/14$, $y_2 = 11/196$, $y_3 = 5/49$.

Therefore the value of the game is $v^* = \frac{1}{x_1 + x_2 + x_3} - 5 = \frac{1}{y_1 + y_2 + y_3} - 5 = 196/45 - 5 = -29/45$. Further the optimal strategies for Player I and Player II are
$x_1^* = 20/45$, $x_2^* = 11/45$, $x_3^* = 14/45$
and
$y_1^* = 14/45$, $y_2^* = 11/45$, $y_3^* = 20/45$
respectively. Here for $i, j = 1, 2, 3$ we have $x_i^* = \frac{x_i}{x_1 + x_2 + x_3}$ and $y_j^* = \frac{y_j}{y_1 + y_2 + y_3}$.
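The passage from the solutions of (4.45)-(4.46) back to the strategies and the value of the original game can be checked with exact arithmetic. The sketch below is an illustration only: the helper names `recover_game_solution` and `solves_game` are hypothetical, and the shift 5 is the constant added to the pay-off matrix in this example.

```python
from fractions import Fraction as F

A = [[3, -1, -3], [-3, 3, -1], [-4, -3, 3]]      # original pay-off matrix

def recover_game_solution(x_lp, y_lp, shift):
    """Map optimal solutions of the scaled LPPs to (x*, y*, v*)."""
    sx, sy = sum(x_lp), sum(y_lp)
    value = 1 / sx - shift                        # undo scaling and shift
    return [v / sx for v in x_lp], [v / sy for v in y_lp], value

def solves_game(A, x, y, v):
    """Conditions of Definition 4.9.2 tested on pure strategies."""
    m, n = len(A), len(A[0])
    cols = [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]
    rows = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    return all(p >= v for p in cols) and all(p <= v for p in rows)

x_star, y_star, v_star = recover_game_solution(
    [F(5, 49), F(11, 196), F(1, 14)],
    [F(1, 14), F(11, 196), F(5, 49)],
    shift=5,
)
print([str(v) for v in x_star], [str(v) for v in y_star], str(v_star))
print(solves_game(A, x_star, y_star, v_star))
```

Note that `Fraction` prints values in lowest terms, so $20/45$ appears as `4/9`. The final check confirms that the recovered triplet satisfies both inequalities of the solution definition for the original matrix A.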


Example 4.10.2 Obtain the matrix game corresponding to the primal-dual pair for the LPP
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Max 4x1 + 3x2 
subject to 
Xi +xX2 <8 
2x1 + X2 < 10 
Wi eae (4.47) | 
| 
and also solve the game so obtained. | 
Solution The dual of the given LPP is | 
Min 8x1 + 10x2 | 
subject to 
W1 + 2w > 4 | 
W1+W2 <3 | 
W1,W2 2 0. (4.48) | 


As $c = (4, 3)^T$, $b = (8, 10)^T$ and $A = \begin{pmatrix} 1 & 1 \\ 2 & 1 \end{pmatrix}$, the matrix game corresponding to the given primal-dual pair of LPP's is

$$B = \begin{pmatrix} 0 & 0 & -1 & -2 & 4 \\ 0 & 0 & -1 & -1 & 3 \\ 1 & 1 & 0 & 0 & -8 \\ 2 & 1 & 0 & 0 & -10 \\ -4 & -3 & 8 & 10 & 0 \end{pmatrix}.$$

Let us recall that the optimal solution of the given LPP is $(x_1^* = 2, x_2^* = 6)$, and that of its dual is $(w_1^* = 2, w_2^* = 1)$. Therefore we can use Theorem 4.10.2 to obtain the solution of the game $B$. We note that $z^* = 1/(1 + 8 + 3) = 1/12$. Therefore the optimal strategy for Player I and also for Player II is $(z^* x_i^*, z^* w_j^*, z^*)$ for $i = 1, 2$ and $j = 1, 2$, i.e. $(1/6, 1/2, 1/6, 1/12, 1/12)$. Further the value of the game is zero.
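The construction above can be reproduced mechanically. The sketch below assumes the block layout read off from the displayed matrix, namely rows $(0, -A^T, c)$, $(A, 0, -b)$ and $(-c^T, b^T, 0)$; the variable names are illustrative.

```python
from fractions import Fraction as F

A = [[1, 1], [2, 1]]
b = [8, 10]
c = [4, 3]
x_opt = [2, 6]   # optimal solution of (4.47)
w_opt = [2, 1]   # optimal solution of (4.48)

m, n = len(A), len(A[0])
B = []
for j in range(n):                       # rows (0, -A^T, c)
    B.append([0] * n + [-A[i][j] for i in range(m)] + [c[j]])
for i in range(m):                       # rows (A, 0, -b)
    B.append(A[i][:] + [0] * m + [-b[i]])
B.append([-cj for cj in c] + b[:] + [0])  # row (-c^T, b^T, 0)

# Common optimal strategy (z x*, z w*, z) with z = 1 / (1 + sum x* + sum w*).
z = F(1, 1 + sum(x_opt) + sum(w_opt))
p = [z * v for v in x_opt] + [z * v for v in w_opt] + [z]

# B is skew-symmetric, so B p = 0 shows that p guarantees the value zero.
Bp = [sum(B[r][k] * p[k] for k in range(len(p))) for r in range(len(B))]
print(p, Bp)
```

Since $B$ is skew-symmetric and $Bp = 0$, the same mixed strategy is optimal for both players and the value is zero, in agreement with the text.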









4.11 Summary and Additional Notes


• This chapter pertains to the study of duality theory in linear programming. Apart from proving various duality theorems in Sections 4.2-4.5, a separate section, namely […] and certain interpretations of the complementary […]
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(3) Max Ay Xo Xa = Fa 
subject to 
Osx < 8 
-4<% <4 
—2<5%3<4 
Os%s W, 
(4) Min X1 +X 
subject to 
xi +2%) = 6 


7X1 + 8X2 < 56 


4.2 Write the dual (D) of the following LPP (P)
Max $9x_1 + 2x_2 + 3x_3$
subject to
$x_1 + 5x_2 + 2x_3 = 30$
$x_1 - 5x_2 - 6x_3 \le 40$
$x_1, x_2, x_3 \ge 0.$


Solve (D) by solving (P) and also verify the complementary slackness theorem. 


4.3 The initial simplex tableau of a LPP in the maximization form (no artificial variables are required) is as follows


| y” yee y®) y) y®) y 
X5 = 6 + 9 7 10 1 0 
1 1 3 40 0 1 


= -20 MN 0 















1. Write (P) and (D).

2. Solve (P) by the simplex method and hence find the optimal solution of the dual problem.

3. Verify the complementary slackness theorem.


4.4 Write the dual of
Max $x_1 - x_2 + x_3 - x_4$
subject to
$x_1 + x_2 + x_3 + x_4 \le 10$
$0 \le x_1 \le 8$
$-4 \le x_2 \le 4$
$-1 \le x_3 \le 4$
$0 \le x_4 \le 10,$

[…] Use the simplex algorithm to verify this claim without writing or solving (D). State the duality theorem to be used in this regard.


4.9 Consider the following LPP (P) 
Min $2x_1 + 15x_2 + 5x_3 + 6x_4$
subject to

Xy + 6X2 + 3x3 + x4 9 

2x4 — 9X2 + x3 — 3x, > 9 

X17 X2, Ka X4 > 0. 

1. Write the dual (D) of the above LPP (P). 

2. Solve (D) graphically. 
3. Utilizing the information in (2) above and various theorems of duality, obtain an optimal solution of (P).


4.10 Solve the following LPP by the dual simplex method

Min $x_1 + 2x_2 + 3x_3 + 4x_4$
subject to
$x_1 + 2x_2 + 2x_3 + 3x_4 \ge 30$
$2x_1 + x_2 + 3x_3 + 2x_4 \ge 20$
$x_1, x_2, x_3, x_4 \ge 0.$
4.11 Consider the LPP 
Max $5x_1 + 12x_2 + 4x_3$
subject to
$x_1 + 2x_2 + x_3 \le 10$
$2x_1 - x_2 + 3x_3 = 8$
$x_1, x_2, x_3 \ge 0.$
Without using the simplex algorithm, find the optimal solution of the dual, given that 
x1, X2 are strictly positive in the optimal solution. 


4.12 Using simplex method, solve the following linear programming problem:

Max $3x_1 + 2x_2$
subject to
$x_1 + 2x_2 \le 6$
$2x_1 + x_2 \le 6$
$x_1, x_2 \ge 0.$

Append an additional constraint $x_1 + x_2 \ge 3$ in the optimal simplex tableau. Does the original optimal solution remain optimal?
| ginal optimal simplex tableau if an additional variable x3 with activity 
3 = 2 is introduced in the original problem and find its 
Me unt A a p l 


"f T 


4.13 Consider the linear programming problem

Max $x_1 + 6x_2$
subject to
$x_1 + 3x_2 \le 12$
$3x_1 + x_2 \le 12$
$x_1 + x_2 = 13$
$x_1, x_2 \ge 0,$


and let following be its last tableau 


y® y® y®) y) y®) 


ic ee On ar 3/8 1/81 =o 
resi O 1/8. -3/8 0 
0 1/4 -3/4 1 


1. What conclusion can you draw from the tableau?

2. Update the table if $b$ is changed to $(12, 12, 6)^T$. What do you infer now?

3. Identify the redundant constraint (if any) and hence find the new optimal basic feasible solution of the reduced problem.

4. Does the addition of the constraint $x_1 + x_2 = 3$ in the reduced problem affect the solution? If yes, then find the optimal basic feasible solution.







nN u 


4.14 Consider the following LPP and its optimal tableau:
Max $2x_1 + x_2 - x_3$
subject to
$x_1 + 2x_2 + x_3 \le 8$
$-x_1 + x_2 - 2x_3 \le 4$
$x_1, x_2, x_3 \ge 0.$

[Optimal tableau with columns $y^{(1)}, y^{(2)}, y^{(3)}, y^{(4)}, y^{(5)}$]

1. Write the dual problem and find the optimal dual solution from the optimal tableau.
2. Find new optimal solution if the coefficient of $x_2$ in the objective function is changed from 1 to 6.
3. Find new optimal solution if the coefficient of $x_2$ in the first constraint is changed […].


150 Numerical Optimization with Applications 


4.18 Are the following statements true? Give reasons for your answer.


1. The primal LP (P) and its dual LP (D), both cannot have unbounded solution.
2. The primal LP (P) and its dual LP (D), both cannot be infeasible.
3. The primal-dual pair of LP's can be solved by solving a system of linear equations in non-negative variables.
4. Dual(dual(dual)) of a LPP is the primal LPP.
5. If the primal LP (P) has unique optimal solution and the dual LP (D) is feasible, then (D) also has unique optimal solution.


4.19 Consider the LPP

Max $(c^Tx - b^Tw)$
subject to
$Ax \le b$
$A^Tw \ge c$
$x, w \ge 0.$

Show that the above LPP is either infeasible or its optimal value is zero.

4.20 Show that the system $A^Tw = c$, $w \ge 0$ is inconsistent if and only if the system $Ax \le 0$, $c^Tx > 0$ is consistent.


4.21 Use linear programming problem to solve the matrix game 


mB 3S7 
2 5 4 -6 


4.22 Solve the following game by the simplex method 





lee 2 
6 
2 | 
T hs the game corresponding to the primal-dual pair of LPP’s where the guven 
P is 


Min XI T 2x 
subject to 





5 


The Transportation and Assignment Problems 


5.1 Introduction


In this chapter we study two special linear programming problems which have tra- 
ditionally been applied in many real life situations. These problems are the classical 
transportation and assignment problems. One key feature of these problems is that, in 
general, they require a very large number of constraints and variables and so a straight 
forward application of the simplex method will turn out to be highly inefficient. How- 
ever, the coefficient matrices of both the transportation and the assignment problems 
have a very special structure. Because of this specific structure, it is possible to develop 
a very simplified version of the simplex method for solving these problems that achieves 
dramatic computational savings in its implementation. We discuss these details in the 
subsequent sections. 


5.2 The Transportation Problem : Description and Mathematical 
Model 


In this section we first describe the single commodity cost minimizing transportation problem and then obtain a mathematical model of the same. For this, let us assume that a single commodity is available at each of the $m$ sources (stores) in $a_1, a_2, \ldots, a_m$ units respectively. This commodity is required at each of the $n$ destinations (shops) in $b_1, b_2, \ldots, b_n$ units respectively. Therefore the commodity has to be transported from sources to destinations. Let $c_{ij}$ $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$ be the per unit cost of transportation of the commodity from the $i$th source to the $j$th destination. In the cost minimizing transportation problem, our aim is to find how many units of the given commodity should be transported from each source to each destination so that the total cost of transportation is minimum.

To get a mathematical model of the above problem we have to first decide the total number of decision variables. Since there are $m$ sources and $n$ destinations and we have to find the units of commodity to be transported from each source to each destination, there are $mn$ decision variables $x_{ij}$. Here $x_{ij}$ denotes the units of commodity to be transported from the $i$th source to the $j$th destination $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$.

$$\text{Min } Z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$$
subject to
$$\sum_{j=1}^{n} x_{ij} \le a_i \quad (i = 1, 2, \ldots, m)$$
$$\sum_{i=1}^{m} x_{ij} \ge b_j \quad (j = 1, 2, \ldots, n)$$
$$x_{ij} \ge 0 \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n). \qquad (5.5)$$


Looking at problem (5.5) above, we observe that here the objective function as well as all constraints are linear functions of the decision variables $x_{ij}$, so this is a linear programming problem and therefore can certainly be solved by the usual simplex method. However, as we shall see in the next section, it is a very special type of LPP where the coefficient matrix $A$ has a very specific structure. Therefore we expect that the matrix $A$ may have certain nice properties which can possibly be exploited so that the implementation of the simplex method for the transportation problem is much simpler and more efficient. This is really true and it will be clear as we proceed with our discussion.


5.3 The Balanced Transportation Problem 


The transportation problem (5.5) is called the balanced transportation problem (BTP) if all the source and destination constraints hold as equations, i.e.

$$\text{Min } Z = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$$
subject to
$$\sum_{j=1}^{n} x_{ij} = a_i \quad (i = 1, 2, \ldots, m)$$
$$\sum_{i=1}^{m} x_{ij} = b_j \quad (j = 1, 2, \ldots, n)$$
$$x_{ij} \ge 0 \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n). \qquad (5.6)$$

From (5.6), it is obvious that if it is feasible then the total quantity available should be equal to the total quantity required, i.e. $\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j$. What we would like to note is that the converse is also true, i.e. if $\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j = \lambda$ (say), then (5.6) is feasible. We can verify the same by taking $x_{ij} = (a_i b_j)/\lambda$ for all $i$ and $j$.

The first $m$ rows of $A$ come from the source constraints, so we call them source rows and denote them by $s_i$ $(i = 1, 2, \ldots, m)$. Similarly, the last $n$ rows of $A$ come from the destination constraints, so we call them destination rows and denote them by $d_j$ $(j = 1, 2, \ldots, n)$. Further, in each of the $mn$ columns of $A$ there are two 'ones', one 'one' coming because of the source rows and the other 'one' coming because of the destination rows. Because of this specific structure of $A$, the following properties are possessed by the matrix $A$, which we state in the form of the following lemmas.
Lemma 5.3.1 Rank $A < (m + n)$.


Proof. As neither $m = 1$ nor $n = 1$, $(m + n)$ never exceeds $mn$. So we shall prove that Rank $A \ne (m + n)$. This follows clearly because

$$\sum_{i=1}^{m} s_i = (1, 1, \ldots, 1) \in R^{mn}$$
and
$$\sum_{j=1}^{n} d_j = (1, 1, \ldots, 1) \in R^{mn},$$
which gives
$$\sum_{i=1}^{m} s_i - \sum_{j=1}^{n} d_j = 0. \qquad (5.8)$$

But (5.8) means that the rows of $A$ are linearly dependent, which proves that Rank $A < (m + n)$. □
Lemma 5.3.2 Rank $A = (m + n - 1)$.


Proof. The proof follows directly from (5.8) because on the left hand side, all coefficients are non-zero, so any row can be written in terms of the remaining $(m + n - 1)$ rows. In fact, this also shows that if we delete any row from $A$, the remaining $(m + n - 1)$ rows are linearly independent.

We can also give a more explicit proof by actually constructing an $(m + n - 1) \times (m + n - 1)$ submatrix of $A$ whose determinant is non-zero. For this we first construct the submatrix $F = \{n^{th}$ column, $2n^{th}$ column, $\ldots$, $mn^{th}$ column, $1^{st}$ column, $\ldots$, $(n-1)^{th}$ column$\}$, which is of order $(m + n) \times (m + n - 1)$. Now we delete the $(m + n)^{th}$ row from $F$ to get

$$D = \begin{pmatrix} I_m & E \\ 0 & I_{n-1} \end{pmatrix} \qquad (5.9)$$

where $I_k$ denotes the $(k \times k)$ identity matrix and $E$ is the $m \times (n - 1)$ block whose first row consists of ones and whose remaining entries are zero. Therefore (5.9) gives
$$|D| = |I_m|\,|I_{n-1}| = 1 \ne 0. \qquad \square$$
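Lemmas 5.3.1 and 5.3.2 can be confirmed on small instances. The sketch below builds the coefficient matrix from the rule that column $(i-1)n + j$ has a 'one' in source row $i$ and in destination row $j$, and computes the rank by exact Gaussian elimination; the helper names are illustrative.

```python
from fractions import Fraction as F

def transportation_matrix(m, n):
    """(m+n) x (mn) coefficient matrix of the balanced transportation problem."""
    A = [[0] * (m * n) for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            A[i][i * n + j] = 1        # source row i
            A[m + j][i * n + j] = 1    # destination row j
    return A

def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    M = [[F(v) for v in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((k for k in range(r, len(M)) if M[k][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for k in range(len(M)):
            if k != r and M[k][col] != 0:
                f = M[k][col] / M[r][col]
                M[k] = [a - f * b for a, b in zip(M[k], M[r])]
        r += 1
        if r == len(M):
            break
    return r

for m, n in [(2, 2), (2, 4), (3, 3), (3, 5)]:
    print(m, n, rank(transportation_matrix(m, n)))
```

In each case the printed rank is $m + n - 1$, one less than the number of rows, which reflects the single dependency (5.8).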
Definition 5.3.1 (Unimodular Matrix). Let $M$ be a matrix whose entries are $0$, $1$ and $-1$ only. Then the matrix $M$ is said to be unimodular if the determinant of every square submatrix of $M$ is either $+1$, $-1$ or $0$.

Remark 5.3.1 If the entries in $M$ are $0$, $1$ and $-1$, it does NOT follow that $M$ is unimodular. For this we can take the $(3 \times 3)$ matrix
I ih (0) 
MSE O 
1 -1-1 S 
and note that $|M| = -2$. So it is not just the entries (they in any case have to be $0$, $+1$ or $-1$), but the way they appear in $M$, i.e. their pattern, that is important.
Lemma 5.3.3 $A$ is unimodular.

Proof. Let $A_k$ be any $(k \times k)$ submatrix of $A$. Looking at the structure of the matrix $A$ we note that for $A_k$ exactly one of the following holds

1. there is a column in $A_k$ having all entries as 'zero',
2. every column of $A_k$ has exactly two 'ones',
3. there is a column in $A_k$ which has exactly one 'one'.

For the case (1), clearly $|A_k| = 0$, while for the case (2), in every column one 'one' has to come from a source row and the other 'one' has to come from a destination row. So if we sum over the source rows of $A_k$ and subtract the sum over the destination rows of $A_k$, we get the zero vector. Therefore the rows of $A_k$ are linearly dependent and so $|A_k| = 0$.

We next consider the case (3), i.e. there is a column in $A_k$ which has exactly one 'one'. We can therefore expand the determinant of $A_k$ along this column and get $|A_k| = \pm|A_{k-1}|$, depending on the location of this 'one'. Now, the same arguments hold for $A_{k-1}$ as well and this gives either $|A_{k-1}| = 0$ or $|A_{k-1}| = \pm|A_{k-2}|$. Continuing in this manner, we get $|A_k| = 0$, $+1$ or $-1$ as all entries in $A$ are $0$ and $1$ only. □

5.4 Consequence of Unimodularity 


As the rank of the matrix $A$ is $(m + n - 1)$, there will be $(m + n - 1)$ basic columns. Here it is convenient to take a double suffix notation for the columns of $A$, i.e. by $p_{ij}$ we denote the $((i-1)n + j)^{th}$ column of $A$. Thus the first column of $A$ is denoted by $p_{11}$, the second by $p_{12}$ and so on, the $(mn)^{th}$ column being denoted by $p_{mn}$. In fact, we may consider an $(m \times n)$ symbolic matrix having $(m \times n)$ cells, each cell representing a column of $A$. So, in general, the $(i, j)^{th}$ cell of the symbolic matrix will be identified by the column $p_{ij}$, which is really the $((i-1)n + j)^{th}$ column of $A$. If we further denote by $e_i$ a vector in $R^{m+n}$ having all entries as 'zero' except one entry being 'one' at the $i^{th}$ place, then

$$p_{ij} = e_i + e_{m+j}, \qquad (5.10)$$

because every column of $A$ has exactly two 'ones', one 'one' at the $i^{th}$ place and the other 'one' at the $(m + j)^{th}$ place.

Now let $\{p_{\alpha\beta}^{(B)}\}$ denote the $(m + n - 1)$ basic columns of $A$. Let $p_{ij}$ be any other column of $A$. Then $p_{ij}$ can always be expressed as a linear combination of basic columns, i.e. symbolically

$$p_{ij} = \sum_{(\alpha, \beta)} y_{ij}^{(\alpha\beta)}\, p_{\alpha\beta}^{(B)} \qquad (5.11)$$

where the scalars $y_{ij}^{(\alpha\beta)}$ are exactly the same as the usual $y_{ij}$ in the simplex algorithm.

Now because of the fact that the coefficient matrix $A$ is unimodular it follows (as we prove below) that the scalars $y_{ij}^{(\alpha\beta)}$ are $+1$, $-1$ or $0$. Also there is a very simple way to find out exactly which $y_{ij}^{(\alpha\beta)}$ is $+1$, which is $-1$ and which is zero. Here the symbolic matrix becomes very handy because it allows the whole implementation in an $(m \times n)$ transportation tableau only.

If we now recall the following formula from the simplex method

$$\hat{x}_{B_i} = x_{B_i} - y_{ij}\left(\frac{x_{B_r}}{y_{rj}}\right) \quad (i \ne r), \qquad \hat{x}_{B_r} = \frac{x_{B_r}}{y_{rj}} \quad (i = r), \qquad (5.12)$$

we note that $y_{rj} = 1$ because for the transportation problem every $y_{ij}$ is $+1$, $-1$ or $0$ and $y_{rj}$ has to be positive. Therefore, for the special case of the transportation problem, (5.12) becomes

$$\hat{x}_{B_i} = x_{B_i} - y_{ij}\, x_{B_r} \quad (i \ne r), \qquad \hat{x}_{B_r} = x_{B_r} \quad (i = r). \qquad (5.13)$$

From (5.13) above, we make an important observation. Because for the transportation problem the matrix $A$ is unimodular, every coefficient $y_{ij}$ is $+1$, $-1$ or zero (see Lemma 5.4.1). Therefore there is no division/multiplication in the whole algorithm; it involves only the operations of addition, subtraction and comparison. This makes the implementation of the simplex algorithm for the transportation problem extremely simple and efficient.


Lemma 5.4.1 Let $\{p_{\alpha\beta}^{(B)}\}$ be a set of $(m + n - 1)$ basic columns of $A$. Let $p_{ij}$ be any other column of $A$. Then

$$p_{ij} = \sum_{(\alpha, \beta)} y_{ij}^{(\alpha\beta)}\, p_{\alpha\beta}^{(B)} \qquad (5.14)$$

where each $y_{ij}^{(\alpha\beta)} = +1$, $-1$, or zero.

Proof. It is obvious that $p_{ij}$ can always be written as a linear combination of basic columns. So we shall only prove that $y_{ij}^{(\alpha\beta)} = +1$, $-1$, or zero. Let $Y_{ij}$ be a vector whose $(m + n - 1)$ entries are $y_{ij}^{(\alpha\beta)}$ and $L$ be the matrix whose columns are $p_{\alpha\beta}^{(B)}$. Then equation (5.14) can be written as

$$p_{ij} = L\,Y_{ij}. \qquad (5.15)$$

But in (5.15) there are $(m + n)$ equations in $(m + n - 1)$ unknowns because $p_{ij} \in R^{m+n}$ and $Y_{ij} \in R^{m+n-1}$. Therefore, we delete one of the rows from (5.15), say in particular the $i^{th}$ row. Then (5.15) becomes

$$e_{m+j-1} = T\,Y_{ij} \qquad (5.16)$$

where $T$ is the matrix obtained from $L$ by deleting the $i^{th}$ row and, as $p_{ij} = e_i + e_{m+j}$, after deleting the $i^{th}$ row the vector $p_{ij}$ reduces to $e_{m+j-1}$. Also $T$ is the matrix of basic columns (after deleting the $i^{th}$ row from $A$), so $T^{-1}$ exists. Therefore from (5.16) we get

$$Y_{ij} = T^{-1} e_{m+j-1}. \qquad (5.17)$$

The entries of $T^{-1}$ are ratios of cofactors of $T$ and $|T|$. As the matrix $A$ is unimodular, the cofactors of $T$ are $+1$, $-1$ or $0$. But $T$ is invertible, so $|T| = +1$ or $-1$. Therefore each of these ratios, i.e. cofactor of $T$ divided by $|T|$, is $+1$, $-1$ or $0$. This proves that all entries in $T^{-1}$ are $+1$, $-1$ or zero. So $y_{ij}^{(\alpha\beta)} = +1$, $-1$ or zero. This proves the lemma. □

Now from the application point of view, it is important that we are able to determine the entries of the vector $Y_{ij}$. Although it is known that all entries of $Y_{ij}$ are $+1$ or $-1$ or zero, we still do not know how to determine them. In the simplex method the scalars $y_{ij}$ are obtained by actually evaluating $B^{-1}a_j$, but in the case of the transportation problem the method is extremely simple. We first illustrate the method below for Example 5.4.1 and then give its justification later. Here the role of the symbolic matrix is really important.
Example 5.4.1 Consider a balanced transportation problem with 2 sources and 4 destinations. Determine the coefficient matrix $A$ and find the expression of a nonbasic column in terms of basic columns.


Solution Here $m = 2$ and $n = 4$. Therefore the coefficient matrix $A$ will be of order $(6 \times 8)$ as given below

$$A = \begin{pmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}$$

The first two rows of $A$ are the source rows and the last four rows of $A$ are the destination rows. Also the rank of $A$ is $2 + 4 - 1 = 5$.

In our notations, the eight columns of $A$ are denoted by $p_{11}, p_{12}, p_{13}, p_{14}, p_{21}, p_{22}, p_{23}$ and $p_{24}$ respectively. This gives the symbolic matrix as

$$\begin{array}{cccc} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \end{array}$$

Let the columns $p_{11}, p_{12}, p_{22}, p_{23}$ and $p_{24}$ be five linearly independent columns of $A$, i.e. these are $(m + n - 1)$ basic columns. For the time being, we accept this as we have not yet learnt the method to find an initial basic feasible solution of a transportation problem. The $(m + n - 1)$ basic columns $p_{\alpha\beta}^{(B)}$ are identified in the symbolic matrix by encircling them as shown

[…] Symbolically, a typical loop will look like $\{(p, q), (p, r), (s, r), (s, t), \ldots, (v, w), (v, q)\}$. Here we note that if the $(k-1)^{th}$ and $k^{th}$ cells are in the same row, then the $k^{th}$ and $(k+1)^{th}$ cells must be in the same column, but in different rows.

We now state following results without proofs 


Result 5.4.1 Every loop must have even number of cells. 


Result 5.4.2 Any set of columns of A are linearly dependent if and only if the corre- 
sponding cells in the symbolic matrix contain a loop. 


Let us again consider the situation when we are trying to express a nonbasic column $p_{ij}$ in terms of the $(m + n - 1)$ basic columns $\{p_{\alpha\beta}^{(B)}\}$. In view of Lemma 5.4.1, we have

$$p_{ij} = \sum_{(\alpha, \beta)} (\pm)\, p_{\alpha\beta}^{(B)},$$

which because of the fact that $p_{ij} = e_i + e_{m+j}$ tells that the number of basic cells on the RHS must be odd with '+' and '-' signs alternately, i.e.

$$p_{ij} = p_{iu}^{(B)} - p_{tu}^{(B)} + p_{tv}^{(B)} - \ldots - p_{ws}^{(B)} + p_{wj}^{(B)}. \qquad (5.18)$$

Since existence of a loop means linear dependence, and every nonbasic column can be written uniquely in terms of basic columns, there will exist one and only one loop for the nonbasic cell $(i, j)$. In a certain sense it is the result which says that every vector of a finite dimensional vector space can be written uniquely in terms of the basis vectors.

In (5.18), as a matter of convention, it is understood that for those cells which are not in the loop, the $y_{ij}$ value is zero.
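The loop for a nonbasic cell can be found by a simple search in the symbolic matrix. The following is a sketch under stated assumptions: cells are 1-based pairs, the basic cells are assumed to form a basis, and the function names are chosen here for illustration. The search alternates a move along a row with a move along a column, and the returned list starts with the nonbasic cell followed by the basic cells that carry the signs $+, -, +, \ldots$ in (5.18).

```python
def find_loop(basic, entering):
    """Loop through `entering` and some basic cells, or None if there is none."""
    def search(path, along_row):
        last = path[-1]
        for cell in basic:
            if cell in path:
                continue
            if along_row and cell[0] == last[0]:
                # A row move that lands in the entering column closes the loop.
                if cell[1] == entering[1]:
                    return path + [cell]
                found = search(path + [cell], False)
            elif not along_row and cell[1] == last[1]:
                found = search(path + [cell], True)
            else:
                continue
            if found:
                return found
        return None
    return search([entering], True)

def column(cell, m, n):
    """Column p_ij = e_i + e_{m+j} of the coefficient matrix, as in (5.10)."""
    i, j = cell
    v = [0] * (m + n)
    v[i - 1] = 1
    v[m + j - 1] = 1
    return v

def signed_combination(loop, m, n):
    """Right hand side of (5.18): basic columns with signs +, -, +, ..."""
    total = [0] * (m + n)
    for k, cell in enumerate(loop[1:]):
        sign = 1 if k % 2 == 0 else -1
        total = [t + sign * c for t, c in zip(total, column(cell, m, n))]
    return total

# Basis of Example 5.4.1 (m = 2, n = 4) and the nonbasic cell (2, 1).
basis = [(1, 1), (1, 2), (2, 2), (2, 3), (2, 4)]
loop = find_loop(basis, (2, 1))
print(loop)  # cells of the loop, entering cell first
```

For this basis the search gives $p_{21} = p_{22} - p_{12} + p_{11}$, and `signed_combination` confirms that the alternating sum of basic columns reproduces the nonbasic column.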


5.5 Optimality Conditions / Stopping Criterion 


In the usual simplex algorithm we stop when all $z_j - c_j \ge 0$ (for the maximization case). So the next objective is to find out the analogue of $(z_j - c_j)$ for the transportation case. For this we write the dual of the given balanced transportation problem (5.6). This dual is

$$\text{Max } \sum_{i=1}^{m} a_i u_i + \sum_{j=1}^{n} b_j v_j$$
subject to
$$u_i + v_j \le c_{ij} \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n),$$
$u_i$, $v_j$ unrestricted in sign.

[…]

(i) North-West Corner Rule

(ii) Method of Column Minima

(iii) Method of Row Minima

(iv) Method of Matrix Minima

(v) VAM (Vogel's Approximation Method).


















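In the following, we describe each of the above methods and illustrate through an example.

(i) North-West Corner Rule

Let the $(m \times n)$ transportation tableau be given as

[Transportation tableau with costs $c_{ij}$, availabilities $a_1, \ldots, a_m$ and requirements $b_1, \ldots, b_n$]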



We start from the north-west corner of the tableau. We then compare $a_1$ and $b_1$, i.e. find $\min(a_1, b_1)$. If this minimum is $a_1$ then we take $x_{11} = a_1$ and cross the first row because there is nothing left at the first source. However if $\min(a_1, b_1) = b_1$ then we take $x_{11} = b_1$ and cross the first column because the requirement of the first destination is fulfilled.

We next consider the north-west corner of the resulting tableau after the first source has been deleted and $b_1$ is updated to $b_1 - x_{11}$ (this will happen when $\min(a_1, b_1) = a_1$) or the first destination has been deleted and $a_1$ is updated to $a_1 - x_{11}$ (this will happen when $\min(a_1, b_1) = b_1$), and then repeat the procedure as described earlier. As the problem is balanced ($\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j$), in the end (the north-west corner of the final resulting transportation tableau) the availabilities and the requirements will match exactly and that common value will be taken as $x_{ij}$ for that cell.

Once we determine the values $x_{ij}$ in this manner, we can prove that the corresponding cells $(i, j)$ in the symbolic matrix will constitute a set of basic cells or equivalently the corresponding columns $p_{ij}$ in $A$ will be linearly independent. Further, in the absence of degeneracy, there will be exactly $(m + n - 1)$ such basic cells and so the corresponding columns give the basis matrix for which the solution so obtained is the basic feasible solution.
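The rule described above translates almost line by line into code. This is a minimal sketch; the function name and the dictionary representation of the allocation (1-based cell to value) are choices made here for illustration.

```python
def north_west_corner(a, b):
    """Starting solution of a balanced transportation problem.

    Returns a dict mapping 1-based basic cells (i, j) to x_ij.
    """
    a, b = list(a), list(b)
    if sum(a) != sum(b):
        raise ValueError("problem is not balanced")
    x = {}
    i = j = 0
    while i < len(a) and j < len(b):
        q = min(a[i], b[j])
        x[(i + 1, j + 1)] = q
        a[i] -= q
        b[j] -= q
        if a[i] == 0:
            i += 1      # source exhausted: cross the row
        else:
            j += 1      # requirement met: cross the column
    return x

start = north_west_corner([10, 15, 20], [13, 18, 14])
print(start)
```

With availabilities (10, 15, 20) and requirements (13, 18, 14) the sketch allocates $x_{11} = 10$, $x_{21} = 3$, $x_{22} = 12$, $x_{32} = 6$ and $x_{33} = 14$, i.e. $m + n - 1 = 5$ basic cells. When a tie occurs before the last cell, the row is crossed and the next cell receives a zero allocation, so the count of basic cells is preserved in the degenerate case as well.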


Example 5.6.1 Find the starting solution to the following transportation problem.

[Transportation tableau with availabilities 10, 15, 20 and requirements 13, 18, 14]

Solution The given problem is balanced because $\sum_{i=1}^{3} a_i = \sum_{j=1}^{3} b_j = 45$. Now we start the N-W corner rule and compute $\min(a_1, b_1) = \min(10, 13) = 10$, so we take $x_{11} = 10$, delete the first source and update $b_1 = b_1 - x_{11} = 13 - 10 = 3$. This gives the resulting tableau as

[…]

Next $\min(a_2, b_1) = \min(15, 3) = 3$, so we take $x_{21} = 3$, delete the first destination and update $a_2$ to $a_2 - x_{21} = 15 - 3 = 12$. This gives the resulting tableau as

[…]

Again $\min(12, 18) = 12$, so we take $x_{22} = 12$, delete the second source and update $b_2$ to $b_2 - x_{22} = 18 - 12 = 6$. This gives the resulting tableau as

[…]

Further, in the absence of degeneracy these basic cells will precisely be $(m + n - 1)$ in number, thereby giving $(m + n - 1)$ linearly independent columns of $A$. In this example the columns $p_{11}, p_{21}, p_{22}, p_{32}$ and $p_{33}$ are five linearly independent columns of $A$.

Though we are not giving a formal proof, it can be verified that these columns of $A$ are linearly independent because there exists no loop amongst any subset of the five basic cells as identified in the symbolic matrix.


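(ii) Method of Column Minima
In this method we start with the first column and choose the smallest cost element in it, say $c_{i1}$. We take $x_{i1} = \min(a_i, b_1)$. If this minimum is $a_i$, we delete the $i^{th}$ source, update $b_1$ to $b_1 - x_{i1}$ and repeat the procedure in the first column. Once the requirement of the first destination is met, we delete the first column and continue the procedure with the next column.

(iii) Method of Row Minima
This is exactly the same as the method of column minima, except that the roles of rows and columns are interchanged. For the example under consideration (Example 5.6.1), the method of row minima gives the following starting basic feasible solution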





(iv) Method of Matrix Minima
This method is again very similar to the earlier two methods, namely 'column minima' and 'row minima'. Here we choose the smallest cost element (say $c_{ij}$) in the whole matrix $C$ and find $\min(a_i, b_j)$. If $\min(a_i, b_j) = a_i$ then we take $x_{ij} = a_i$, drop the $i^{th}$ source and update $b_j$ to $b_j - x_{ij}$. Similarly, if $\min(a_i, b_j) = b_j$, we take $x_{ij} = b_j$, drop the $j^{th}$ destination and update $a_i$ to $a_i - x_{ij}$. We next find the smallest element in the resulting cost matrix and continue.
For the example under consideration, the method of matrix minima gives the following starting basic feasible solution

[…]

(v) VAM (Vogel's Approximation Method)
Here we first find the row differences and column differences for each row and column of the cost matrix $C$. For a given row (column) the row difference (column difference) is defined as the difference of the smallest and the next smallest cost elements in that row (column). Once these $m$ row differences and $n$ column differences are found, we find the maximum of these $(m + n)$ numbers. If this maximum falls for a row (say the $i^{th}$ row) then we do row minima in the $i^{th}$ row and then delete the $i^{th}$ source. If the maximum falls for a column (say the $j^{th}$ column) then we do column minima in the $j^{th}$ column and then delete the $j^{th}$ column. We next compute the new row and column differences for the reduced cost matrix and continue the procedure. This method is best suited if the row differences and column differences are distinct.
For the example under consideration we obtain the following starting bfs using VAM


13 18 14 Row Differences 





10 1 
15 1 1 
20 i i l 
Column D 1 1 
differences 
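Of the starting methods above, the method of matrix minima is perhaps the simplest to sketch in code. The version below is an illustration under stated assumptions: the function name is chosen here, ties in the smallest cost are broken by the first cell in row-major order, and the cost matrix is a hypothetical one (it is not the tableau of the book's example, which is not reproduced in this passage).

```python
def matrix_minima(cost, a, b):
    """Starting solution by repeatedly allocating at the smallest remaining cost."""
    a, b = list(a), list(b)
    rows, cols = list(range(len(a))), list(range(len(b)))
    x = {}
    while rows and cols:
        # Smallest cost among the cells not yet crossed out.
        i, j = min(((r, c) for r in rows for c in cols),
                   key=lambda rc: cost[rc[0]][rc[1]])
        q = min(a[i], b[j])
        x[(i + 1, j + 1)] = q
        a[i] -= q
        b[j] -= q
        # Drop the exhausted source; on a tie keep the last remaining row.
        if a[i] == 0 and (b[j] > 0 or len(rows) > 1):
            rows.remove(i)
        else:
            cols.remove(j)
    return x

# Hypothetical cost matrix, for illustration only.
cost = [[2, 7, 4],
        [6, 3, 2],
        [4, 2, 3]]
a = [10, 15, 20]
b = [13, 18, 14]
x = matrix_minima(cost, a, b)
total_cost = sum(cost[i - 1][j - 1] * q for (i, j), q in x.items())
print(x, total_cost)
```

The allocation meets every availability and requirement and uses $m + n - 1 = 5$ cells; the same skeleton gives the column minima or row minima method if the search for the smallest cost is restricted to the current column or row.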





5.7 The Complete Algorithm

We now have the complete machinery to describe the transportation algorithm for solving the balanced transportation problem (5.6). As explained earlier, this algorithm is essentially the simplex algorithm whose implementation gets simplified due to the fact that the coefficient matrix $A$ is unimodular. The whole algorithm is essentially based on the operations of comparison, addition and subtraction; and there is no operation of multiplication/division because all $y_{ij}^{(\alpha\beta)} = +1$, $-1$ or $0$, a property which is a consequence of the unimodular property of the matrix $A$. The stepwise description of the algorithm is as follows.

Step 1 Consider the $(m \times n)$ transportation tableau and obtain the starting solution by employing any of the methods described in Section 5.6 (Step 1 will not only give the values of basic variables but also identify $(m + n - 1)$ basic cells in the symbolic matrix).

Step 2 Determine $(m + n)$ scalars (dual variables) $u_i$ and $v_j$ such that $u_i + v_j = c_{ij}$ for basic cells. As explained, here the value of one unknown is to be chosen arbitrarily and in practice we choose $u_1 = 0$.

Step 3 Compute $(u_i + v_j - c_{ij})$ for each of the nonbasic cells. If all $(u_i + v_j - c_{ij}) \le 0$, then the current solution is optimal and therefore we stop. However, if some $(u_i + v_j - c_{ij}) > 0$ then the current solution is not optimal and we go to Step 4.

Step 4 Choose the positive most value of $(u_i + v_j - c_{ij})$. Let this be $(u_r + v_s - c_{rs})$. The cell $(r, s)$ becomes a basic cell and the corresponding column $p_{rs}$ will enter the basis.

Step 5 In the symbolic matrix, find the representation of the $(r, s)^{th}$ cell in terms of the basic cells, i.e. find the loop. Choose the minimum of $x_{ij}^{(B)}$ over those $(i, j)$ for which $y_{rs}^{(ij)} = +1$, i.e. over those basic variables for which (+) sign is attached in the loop. Let this minimum be attained at the cell $(u, v)$. Then the cell $(u, v)$ becomes a nonbasic cell and the corresponding column $p_{uv}$ will leave the basis.

Step 6 Find the new b.f.s and go to Step 2. The new solution is obtained by using the formula

$$\hat{x}_{\alpha\beta}^{(B)} = x_{\alpha\beta}^{(B)} - y_{rs}^{(\alpha\beta)}\, x_{uv}^{(B)}, \quad (\alpha, \beta) \ne (u, v), \qquad \hat{x}_{rs} = x_{uv}^{(B)},$$

i.e. in the new tableau, the value $x_{uv}^{(B)}$ is shifted to the $(r, s)^{th}$ cell, it is added to all those cells in the loop for which (-) sign is attached ($y_{rs}^{(\alpha\beta)} = -1$) and subtracted from all those cells in the loop for which (+) sign is attached ($y_{rs}^{(\alpha\beta)} = +1$). For those cells $(i, j)$ which are not in the loop ($y_{rs}^{(\alpha\beta)} = 0$), the value $x_{ij}^{(B)}$ remains unchanged.

Remark 5.7.1 As the transportation algorithm is essentially the simplex algorithm, we will obtain the optimal solution in a finite number of iterations. Here we may note that if $c_{ij} \ge 0$ then the transportation problem cannot have an unbounded solution because it is a minimization problem and $\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} \ge 0$ for $x_{ij}$ feasible.


Remark 5.7.2 We make a very important observation about the transportation problem. If $a_i$ $(i = 1, 2, \ldots, m)$ and $b_j$ $(j = 1, 2, \ldots, n)$ are integers then the starting solution is also in terms of integers only because there we are essentially comparing two integers to find $\min(a_i, b_j)$. In the subsequent steps, as all $y_{ij}$ are $+1$, $-1$ or $0$, there is no division/multiplication in the algorithm and each time two integers are being either added or subtracted. Therefore the initial b.f.s as well as all subsequent b.f.s are in terms of integers only. This shows that the optimal solution is also in terms of integers only.

The main reason for this property to hold is that the coefficient matrix $A$ of a balanced transportation problem is unimodular.





We now illustrate the working of the transportation algorithm with the help of some 
examples. 


Example 5.7.1 Solve the followin 
g transportation proble j ) 
lution using the north west corner rule problem by finding the starting s0 A 
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Solution The first iteration of the algorithm is 


M, | Step 1 The north west corner rule gives the following starting b.f.s. 
a} 13 18 14 

10 

15 

20 





Step 2 Find u1, U2, U3, V1, V2, % such that uj + vj = Cij for each basic cell, i.e 
u +v = 2, u +01 = 6, Up + V2 = 3, uz + V2 = 2 and u3 + v3 = 3. Taking u = 0, we 
get vı = 2 Up = 6-2 =4, Up = 3-2 =3-4=-1, uz = 2 — v = 2 — (—1) = 3 and 


13 =3-u3=3-3=0. 
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10 


15 


20 


i e are Uy + U2 — C12 = 
xt compute Uj + v; — cj; for each nonbasic cell. TER Tona 
o e ite these numbers in the rig O 
=3+2-4=1. We write these 
u3 + V1 — C31 = 3 + 








Step 4 Since some of the values $u_i + v_j - c_{ij}$ are positive, the current solution is not optimal.
The most positive value of $u_i + v_j - c_{ij}$ is $u_2 + v_3 - c_{23} = 2$. So column $p_{23}$ enters
the basis, or equivalently the cell (2,3) becomes a basic cell.
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1) P12 P13 


D | 
poa 


If there is no confusion then we make the loop in the tableau itself, but we must
remember that the loop is in the symbolic matrix only. Therefore making the loop in the
tableau itself we get





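[Tableau with the loop marked, not recoverable from the scan. Availabilities: 10, 15, 20;
requirements: 13, 18, 14; $v_1 = 2$, $v_2 = -1$, $v_3 = 0$.]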


Now we identify those cells in the loop for which the '+' sign is attached (i.e. the $y_{ij}$ value
is $+1$) and choose the one for which $x_{ij}$ is least. Here it is $x_{22} = 12$, so cell (2,2) becomes
nonbasic and the corresponding column $p_{22}$ leaves the basis.

Here we can see that it is essentially the usual minimum ratio criterion which is
being used to identify the column which is going to leave the basis. We recall that the
usual criterion is

$$\frac{x_{B_r}}{y_{rj}} = \min_i \left\{ \frac{x_{B_i}}{y_{ij}} : y_{ij} > 0 \right\},$$

which, as every positive $y_{ij}$ equals $+1$ here, becomes
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xp, = min (xp; ‘Yi = +1}. 
In the above, of course we should have used the double suffix notations of the transportation
problem but we have avoided that so that the real point can be understood much more
clearly.

Step 6 We now find the new solution, i.e. the new transportation tableau. Since the
cells (1,1) and (2,1) are not in the loop, the values of the corresponding basic variables
will not change, i.e. $x_{11} = 10$ and $x_{21} = 3$. Also, as cell (2,3) is going to be a basic
cell and (2,2) is going to be a nonbasic cell, the value of the new basic variable $x_{23}$ will
be the same as the value of the leaving variable $x_{22}$, i.e. $x_{23} = 12$. For the other cells
in the loop, the value $x_{22} = 12$ will be added to all those basic cells for which the '-' sign
is attached and subtracted from all those basic cells for which the '+' sign is attached. This
gives the following tableau


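[Tableau not recoverable from the scan. Applying the rule above to the loop through cells
(2,3), (3,3), (3,2) and (2,2) gives the new b.f.s $x_{11} = 10$, $x_{21} = 3$, $x_{23} = 12$,
$x_{32} = 18$, $x_{33} = 2$.]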





We now go to the next iteration and repeat the procedure. In practice we perform all
the above steps in the same tableau. For our example, we get the following tableaus at
the subsequent iterations














[Tableaus for the subsequent iterations not recoverable from the scan.]

Now in the last tableau all $(u_i + v_j - c_{ij})$ are non positive and hence we stop. The current
solution, namely $x^*_{11} = 10$, $x^*_{22} = 1$, $x^*_{23} = 14$, $x^*_{31} = 3$, $x^*_{32} = 17$, is an optimal solution
and $z^* = (2 \times 10) + (3 \times 1) + (2 \times 14) + (4 \times 3) + (2 \times 17) = 97$ is the optimal value.

Example 5.7.2 For the problem of Example 5.7.1, obtain the optimal solution by finding
the starting solution using (i) column minima (ii) row minima (iii) matrix minima.

Solution (i) We find the starting solution using the method of column minima and get

[Tableau not recoverable from the scan.]
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13 18 rja 





So the current solution is optimal. 


Remark 5.7.3 In general the methods of column minima, row minima, matrix minima
and VAM will give a starting solution which will be 'nearer' to the optimal solution than
the one given by the north west corner rule. Therefore if we find the starting solution by
the methods of column minima/row minima/matrix minima/VAM and then solve the
problem, we expect to take fewer iterations than when we find the starting solution by
the north west corner rule. This is because, as explained earlier, the north west corner
rule depends on $a_i$ and $b_j$ only, whereas the other four methods depend on $a_i$, $b_j$
as well as on $c_{ij}$.


5.8 Degeneracy in the Transportation Problem

To understand the degeneracy in the transportation problem, let us consider the following
balanced transportation problem and find its starting solution by using the north
west corner rule to get the tableau
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i 
[Tableau not recoverable from the scan. Availabilities: $a = (10, 15, 20)$.]

Here we note that in the symbolic matrix we have only four basic cells whereas we
need $(m + n - 1) = 3 + 3 - 1 = 5$ basic cells.

[Symbolic matrix not recoverable from the scan.]

This indicates that in the transportation tableau there is one more basic variable at 
the zero level which has got mixed with other nonbasic variables which are any way at 
zero level by definition. Since one basic variable is taking the zero value, the current 
b.f.s is degenerate. But before we start performing the usual steps of the transportation 
algorithm we need to identify that basic cell for which the value of the basic variable 
is zero. In other words, we have to identify an additional basic cell in the symbolic 
matrix and take the corresponding value of the basic variable as zero. For this we use
the property that in any transportation tableau (or equivalently in the corresponding
symbolic matrix) the set of $(m + n - 1)$ basic cells is connected, i.e. given any two
basic cells $(s,t)$ and $(u,v)$ there is always a path $\{(s,t), (s,r), (p,r), \ldots, (u,v)\}$ between
them. This means that we can start from the cell $(s,t)$ and, moving along
rows/columns via basic cells, reach the cell $(u,v)$.

If the given b.f.s is degenerate, we have less than $(m + n - 1)$ basic cells. Therefore
the set of basic cells need not be connected; in the tableau above there is no path
between cells (1,1) and (2,2). So we have to choose an additional cell such
that by taking that additional cell the set of $(m + n - 1)$ cells becomes connected. It is
not difficult to see that we can take this cell to be (1,2) or (2,1); in fact there are more
possibilities as well. If we decide for the cell (1,2), then we take $x_{12} = 0$ as a basic
variable and make the cell (1,2) a basic cell, giving the following b.f.s of $(m + n - 1)$
basic cells, and start the transportation algorithm as usual.

[Tableau not recoverable from the scan. Requirements: $b = (10, 20, 15)$; basic cells:
$x_{11} = 10$, $x_{12} = 0$, $x_{22} = 15$, $x_{32} = 5$, $x_{33} = 15$.]





The main reason for getting one basic cell short is that as $\min(a_1, b_1) = \min(10, 10)
= 10$, we meet the source constraint as well as the destination constraint simultaneously,
thereby deleting the first source and the first destination at the same time. Whenever
this happens we shall always be short of one basic cell. Therefore, depending upon the
situation, we may be short of one or more basic cells relative to the required number
$(m + n - 1)$, and we will have to introduce an appropriate number of basic cells at zero level
as explained above to get the $(m + n - 1)$ basic variables so that the algorithm can be
started.


While adding a cell at zero level and encircling it so that it gets differentiated from
other nonbasic cells at zero level is the correct way, it may create some problem in the
implementation. If we are solving the problem by hand then we can always differentiate
between two 'zeros', one which is encircled and the other which is not (for a nonbasic
variable), but there may be some problem if we are working on a machine. Therefore
we take the value of this basic variable as a small number $\varepsilon$ (say 0.01) and make
the appropriate modifications as shown below and implement the algorithm.

[Tableau with the $\varepsilon$ modifications not recoverable from the scan.]

5.9 The Unbalanced Transportation Problem

We have defined a transportation problem as a balanced transportation problem if
$\sum_{i=1}^{m} a_i = \sum_{j=1}^{n} b_j$. A transportation problem which is not balanced is termed as an
unbalanced transportation problem. Therefore the general transportation problem

$$\begin{aligned}
\text{Min } z = {} & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} \\
\text{subject to } & \sum_{j=1}^{n} x_{ij} \le a_i \quad (i = 1, 2, \ldots, m) \\
& \sum_{i=1}^{m} x_{ij} \ge b_j \quad (j = 1, 2, \ldots, n) \\
& x_{ij} \ge 0 \text{ for all } i \text{ and } j
\end{aligned} \qquad (5.22)$$

will be unbalanced if $\sum_{i=1}^{m} a_i > \sum_{j=1}^{n} b_j$ or $\sum_{i=1}^{m} a_i < \sum_{j=1}^{n} b_j$. We shall discuss each of
these two cases separately. In our discussion, we shall assume that $c_{ij} \ge 0$ for all $i$
and $j$.

Result 5.9.1 Let $c_{ij} \ge 0$ for all $i$ and $j$. Let $(x^*_{ij})$ be an optimal
solution of (5.22); then $\sum_{i=1}^{m} x^*_{ij} = b_j$ $(j = 1, 2, \ldots, n)$.

Though a formal mathematical proof can be given, the result is intuitively clear. Supplying
more than $b_j$ $(j = 1, 2, \ldots, n)$ to the $j^{th}$ destination will incur additional cost
(as $c_{ij} \ge 0$) and therefore it will not be an optimal solution. Since the requirement
of the $j^{th}$ destination is $b_j$ or more and we wish to minimize the cost of transportation,
we should not send anything more than $b_j$ as $c_{ij} \ge 0$ for all $i$ and $j$.

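Case 1 Let $\sum_{i=1}^{m} a_i > \sum_{j=1}^{n} b_j$.

In the following we shall show that in this case, the given unbalanced transportation
problem (5.22) can be transformed into a new transportation problem which is balanced
and therefore can be solved by the usual transportation algorithm discussed in Section
5.7. It is obvious that for Case 1 problem (5.22) is certainly feasible and has an optimal
solution. In view of Result 5.9.1, problem (5.22) is equivalent to

$$\begin{aligned}
\text{Min } z = {} & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} \\
\text{subject to } & \sum_{j=1}^{n} x_{ij} \le a_i \quad (i = 1, 2, \ldots, m) \\
& \sum_{i=1}^{m} x_{ij} = b_j \quad (j = 1, 2, \ldots, n) \\
& x_{ij} \ge 0 \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.23)$$

We now introduce $m$ slack variables $x_{i,n+1}$ $(i = 1, 2, \ldots, m)$ in each of the source
constraints of (5.23) to get

$$\begin{aligned}
\text{Min } & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} + 0\,x_{1,n+1} + \cdots + 0\,x_{m,n+1} \\
\text{subject to } & \sum_{j=1}^{n} x_{ij} + x_{i,n+1} = a_i \quad (i = 1, 2, \ldots, m) \\
& \sum_{i=1}^{m} x_{ij} = b_j \quad (j = 1, 2, \ldots, n) \\
& x_{ij} \ge 0 \text{ for all } i \text{ and } j,
\end{aligned} \qquad (5.24)$$

i.e.

$$\begin{aligned}
\text{Min } & \sum_{i=1}^{m}\sum_{j=1}^{n+1} c_{ij}x_{ij} \quad (c_{i,n+1} = 0,\ i = 1, 2, \ldots, m) \\
\text{subject to } & \sum_{j=1}^{n+1} x_{ij} = a_i \quad (i = 1, 2, \ldots, m) \\
& \sum_{i=1}^{m} x_{ij} = b_j \quad (j = 1, 2, \ldots, n) \\
& x_{ij} \ge 0 \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.25)$$

But $x_{i,n+1} = a_i - \sum_{j=1}^{n} x_{ij}$ and hence

$$\sum_{i=1}^{m} x_{i,n+1} = \sum_{i=1}^{m} a_i - \sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij} = \sum_{i=1}^{m} a_i - \sum_{j=1}^{n} b_j = b_{n+1} \text{ (say)}. \qquad (5.26)$$

In view of equation (5.26), problem (5.25) can be rewritten as

$$\begin{aligned}
\text{Min } & \sum_{i=1}^{m}\sum_{j=1}^{n+1} c_{ij}x_{ij} \quad (c_{i,n+1} = 0,\ i = 1, 2, \ldots, m) \\
\text{subject to } & \sum_{j=1}^{n+1} x_{ij} = a_i \quad (i = 1, 2, \ldots, m) \\
& \sum_{i=1}^{m} x_{ij} = b_j \quad (j = 1, 2, \ldots, n, n+1)
\end{aligned}$$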
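The transformation of Case 1 amounts to appending a dummy destination with zero costs and requirement $b_{n+1} = \sum a_i - \sum b_j$. A minimal sketch follows; the function name and the small data set are hypothetical and are used for illustration only.

```python
def balance_with_dummy_destination(cost, supply, demand):
    """Append a zero-cost dummy destination that absorbs the excess supply (Case 1)."""
    excess = sum(supply) - sum(demand)
    if excess < 0:
        raise ValueError("Case 1 requires total supply >= total demand")
    new_cost = [list(row) + [0] for row in cost]    # c_{i,n+1} = 0 for every source i
    new_demand = list(demand) + [excess]            # b_{n+1} as in (5.26)
    return new_cost, list(supply), new_demand


# Hypothetical 2 x 2 data, not taken from the book.
cost = [[2, 3], [4, 1]]
new_cost, new_supply, new_demand = balance_with_dummy_destination(cost, [10, 15], [8, 12])
print(new_cost, new_supply, new_demand)
```

The allocation to the dummy column plays the role of the slack variable $x_{i,n+1}$, i.e. the amount left unshipped at source $i$.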
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yt, = 5, Xs = Oia = by ee = 14, Ke 
_ sot om he above bls = a ZA iaa a 
these are supplied to the first destination (because $x_{41} = 8$), we infer that the requirement
of the first destination will fall short by 8 units of commodity.

Example 5.9.3 Solve the transportation problem given in Example 5.9.2 but with the
additional information that the requirement of the first destination has to be met exactly.
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An optimal solution of the above balanced transportation problem 1s obtained as 


13 18 14 





as pea ip above tableau, the solution of the original problem is obtained as x‘, = 
, 2 = 1, X= 14, x3, = 3, X32 = 9 and other x*. = 0 (i = 1 2pos Fe 159.3 mia 
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j} person charges to complete the i job. Here we also assume that only one person is 
to be assigned to one job and vice-versa. In the assignment problem, we wish to determine
the assignment of persons to the jobs so that the total cost of completing all the jobs is
minimum.

To get a mathematical model of the assignment problem we introduce variables
$x_{ij}$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, n)$ as follows

$$x_{ij} = \begin{cases} 1 & \text{if the } i^{th} \text{ person is assigned the } j^{th} \text{ job} \\ 0 & \text{otherwise.} \end{cases}$$

Let $X = (x_{ij})$ be the $(n \times n)$ matrix with entries $x_{ij}$. As only one person is to be
assigned only one job and vice versa, if there is 'one' in the $i^{th}$ row and the $j^{th}$ column
of the matrix $X$ then all other entries in that row and column must be zero. Therefore,
the mathematical description of the physical constraint that 'only one person should be
assigned to one job and vice versa' is

$$\begin{aligned}
& \sum_{i=1}^{n} x_{ij} = 1 \quad (j = 1, 2, \ldots, n) \\
& \sum_{j=1}^{n} x_{ij} = 1 \quad (i = 1, 2, \ldots, n) \\
& x_{ij} = 0 \text{ or } 1 \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.29)$$


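We next consider the quantity to be optimized and express the same in terms of parameters
(i.e. $c_{ij}$) and decision variables (i.e. $x_{ij}$). Here the quantity to be optimized is the
'total cost of completing all the jobs' which can be expressed as

$$\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} \qquad (5.30)$$

because in (5.30) the contribution from the term $c_{ij}x_{ij}$ is $c_{ij}$ or 'zero' depending upon
whether the $i^{th}$ person is assigned the $j^{th}$ job or not, and then we are summing only such
$c_{ij}$'s. Therefore the mathematical model of the assignment problem (AP) is

$$\begin{aligned}
\text{Min } z = {} & \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} \\
\text{subject to } & \sum_{i=1}^{n} x_{ij} = 1 \quad (j = 1, 2, \ldots, n) \\
& \sum_{j=1}^{n} x_{ij} = 1 \quad (i = 1, 2, \ldots, n) \\
& x_{ij} = 0 \text{ or } 1 \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.31)$$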
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think of the assig umpa etc) which 

are to be located at $n$ physical locations of a city. Let $c_{ij}$ be the cost of assigning the
$i^{th}$ facility to the $j^{th}$ location. Our aim here is to find the optimal location of the
given $n$ facilities to the given $n$ locations.


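We now have the following definitions.

Definition 5.10.1 (Doubly Stochastic Matrix). An $(n \times n)$ matrix $D = (d_{ij})$ is
called a doubly stochastic matrix if its entries are non-negative and all its row sums and
column sums are unity, i.e.

$$\begin{aligned}
& \sum_{i=1}^{n} d_{ij} = 1 \quad (j = 1, 2, \ldots, n) \\
& \sum_{j=1}^{n} d_{ij} = 1 \quad (i = 1, 2, \ldots, n) \\
& d_{ij} \ge 0 \text{ for all } i \text{ and } j.
\end{aligned}$$

Definition 5.10.2 (Permutation Matrix). An $(n \times n)$ matrix $P = (p_{ij})$ is called a
permutation matrix if its entries are 'zero' or 'one' and all its row sums and column
sums are unity, i.e.

$$\begin{aligned}
& \sum_{i=1}^{n} p_{ij} = 1 \quad (j = 1, 2, \ldots, n) \\
& \sum_{j=1}^{n} p_{ij} = 1 \quad (i = 1, 2, \ldots, n) \\
& p_{ij} = 0 \text{ or } 1 \text{ for all } i \text{ and } j.
\end{aligned}$$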


Let D denote the set of all (n x n) doubly stochastic matrices and P denote the set 
of all (n x n) permutation matrices. Then we have the following result 


Result 5.10.1 The set $\mathcal{D}$ of all $(n \times n)$ doubly stochastic matrices forms a convex set
and permutation matrices $P \in \mathcal{P}$ are its extreme points.

It is not straightforward, in general, to prove that the permutation matrices are exactly
the extreme points of $\mathcal{D}$, although one direction of the proof is relatively easy. The
following example illustrates Result 5.10.1 more clearly.
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Pi=(t 
Clearly $D$ is a doubly stochastic matrix and $P_1$ is a permutation matrix. Let us consider
another permutation matrix $P_2$, namely the only other $(2 \times 2)$ permutation matrix.
Then $D = \frac{2}{3}P_1 + \frac{1}{3}P_2$, i.e. $D$ can be written as a convex combination of permutation
matrices $P_1$ and $P_2$. This follows because $\mathcal{D}$ is a compact convex set having finitely many
extreme points, namely all $(n \times n)$ permutation matrices. Therefore every doubly stochastic
matrix $D \in \mathcal{D}$ can be written as a convex combination of its extreme points, namely
$P \in \mathcal{P}$. For $n = 2$, the only extreme points of $\mathcal{D}$ are $P_1$ and $P_2$.

In view of the above, we can also view the assignment problem (5.31) as an optimization
problem over the set of all $(n \times n)$ permutation matrices $\mathcal{P}$, i.e. given an $(n \times n)$
cost matrix $C = (c_{ij})$ we have to find an $(n \times n)$ matrix $P = (p_{ij}) \in \mathcal{P}$ such that

$$\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}p_{ij} = \min_{X \in \mathcal{P}} \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij}. \qquad (5.32)$$
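For very small $n$, formulation (5.32) can be checked by enumerating all $n!$ permutation matrices directly. The sketch below does this for the $3 \times 3$ cost matrix used in Example 5.11.1; the function name `best_assignment` is an illustrative choice, and brute force is used here only as a check, not as a practical method.

```python
from itertools import permutations


def best_assignment(cost):
    """Minimise sum_i cost[i][p[i]] over all permutations p (0-indexed)."""
    n = len(cost)

    def total(p):
        return sum(cost[i][p[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return best, total(best)


C = [[20, 27, 30],
     [10, 18, 16],
     [14, 16, 12]]
perm, value = best_assignment(C)
# perm[i] is the job given to person i, both counted from 0.
print(perm, value)
```

Since the number of permutations grows as $n!$, this enumeration is feasible only for small problems.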

Further, because of Result 5.10.1, the optimization problem (5.32) can also be viewed
as an optimization problem over the set of all $(n \times n)$ doubly stochastic matrices $\mathcal{D}$. The
main argument used here is that in (5.31) we are minimizing a linear function, so the
optimum will be attained at an extreme point.

The end result of the above discussion is essentially the fact that in (5.31), we can
replace the constraint $x_{ij} = 0$ or $1$ simply by $x_{ij} \ge 0$ for all $i$ and $j$. Therefore the
mathematical model of the assignment problem can also be taken as

$$\begin{aligned}
\text{Min } z = {} & \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} \\
\text{subject to } & \sum_{i=1}^{n} x_{ij} = 1 \quad (j = 1, 2, \ldots, n) \\
& \sum_{j=1}^{n} x_{ij} = 1 \quad (i = 1, 2, \ldots, n) \\
& x_{ij} \ge 0 \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.33)$$
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quivalent of the Assignim 


5.11 A Transportation Equivalent of the Assignment Problem

While studying the balanced transportation problem we observed that if availabilities
$a_i$ and requirements $b_j$ are integers then its optimal solution is also in terms of
integers. Now problem (5.33) is a special case of the balanced transportation problem
with $m = n$, $a_i = 1$ for all $i$ and $b_j = 1$ for all $j$. Since $a_i$ and $b_j$ are integers, its optimal
solution $x^*_{ij}$ (as a transportation problem) will also be in integers. But $a_i = 1$ for all $i$ and
$b_j = 1$ for all $j$. This, together with the fact that $x^*_{ij}$ has to be a non negative integer
for all $i$ and $j$, implies that $x^*_{ij} = 0$ or $1$. Therefore the assignment problem (5.31) is
equivalent to the special transportation problem (5.33).

The above discussion suggests that a possible way to solve the assignment problem
(5.31) could be to solve the equivalent transportation problem (5.33) by the usual
transportation algorithm. This is illustrated by the example given below.


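Example 5.11.1 Consider the following cost minimizing assignment problem

$$C = \begin{pmatrix} 20 & 27 & 30 \\ 10 & 18 & 16 \\ 14 & 16 & 12 \end{pmatrix}$$

and solve the same by solving the equivalent transportation problem.

Solution Here $m = n = 3$, $a_1 = a_2 = a_3 = 1$ and $b_1 = b_2 = b_3 = 1$. Also $c_{11} = 20$, $c_{12} = 27$,
$c_{13} = 30$, $c_{21} = 10$, $c_{22} = 18$, $c_{23} = 16$, $c_{31} = 14$, $c_{32} = 16$, and $c_{33} = 12$. Therefore
the equivalent transportation problem to be solved is

[Transportation tableau not recoverable from the scan.]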
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v= 19 v= 27 v= 25 


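Therefore we obtain $x^*_{12} = 1 = x^*_{21} = x^*_{33}$ and other $x^*_{ij} = 0$. Since $x_{ij} = 1$ if the $i^{th}$
person is assigned the $j^{th}$ job, we obtain $P_1 \to J_2$, $P_2 \to J_1$, $P_3 \to J_3$ as a solution of the
given assignment problem, i.e.

$$X^* = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Further the minimum cost of completing all the jobs is $27 + 10 + 12 = 49$.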
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| gh tion ns are higi ily degenerate. In any tableau, 
we need $(2n - 1)$ basic variables. But out of these $(2n - 1)$ basic variables,
$(2n - 1) - n = (n - 1)$ basic variables are always at the zero level. Such high degeneracy
may lead to a large sequence of simplex tableaus (equivalently transportation tableaus)
without any improvement in the objective function value. Therefore we need to look at
problem (5.31) and see if something better can be done rather than just applying the usual
transportation algorithm. Fortunately this is possible and it is accomplished by the
Hungarian Method which is being discussed here in the subsequent sections.

5.12 An Important Lemma 


Let AP($C$) denote the assignment problem for the cost matrix $C = (c_{ij})$. Let $\hat{C} = (\hat{c}_{ij})$
where $\hat{c}_{ij} = c_{ij} + \alpha_i + \beta_j$, $\alpha_i \in \mathbb{R}$ $(i = 1, 2, \ldots, n)$ and $\beta_j \in \mathbb{R}$ $(j = 1, 2, \ldots, n)$. Thus the
matrix $\hat{C}$ is obtained from the matrix $C$ by adding or subtracting a fixed number from
all the elements of a particular row or particular column. As the constraints of the
assignment problem do not involve the cost matrix, the feasible region of both AP($C$)
and AP($\hat{C}$) is the same. We call a feasible solution of the assignment problem an assignment.

The following lemma connecting the assignment problems AP($C$) and AP($\hat{C}$) plays a
crucial role in the development of the Hungarian method for solving the assignment
problem.

Lemma 5.12.1 The assignment problems AP($C$) and AP($\hat{C}$) have the same optimal
assignments.


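Proof. For a feasible assignment $X$, let $f(C, X)$ and $f(\hat{C}, X)$ denote the objective function
values of AP($C$) and AP($\hat{C}$) respectively. Then

$$\begin{aligned}
f(\hat{C}, X) &= \sum_{i=1}^{n}\sum_{j=1}^{n} \hat{c}_{ij}x_{ij} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} (c_{ij} + \alpha_i + \beta_j)\,x_{ij} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} + \sum_{i=1}^{n} \alpha_i \sum_{j=1}^{n} x_{ij} + \sum_{j=1}^{n} \beta_j \sum_{i=1}^{n} x_{ij} \\
&= f(C, X) + \sum_{i=1}^{n} \alpha_i + \sum_{j=1}^{n} \beta_j. \qquad (5.34)
\end{aligned}$$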
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Therefore AP(C) and AP(Ĉ) will have the icing ae | g ‘h 

objective function values $f(C, X)$ and $f(\hat{C}, X)$ will differ only by a constant.

Remark 5.12.1 In view of Lemma 5.12.1, without any loss of generality we can assume
that $c_{ij} \ge 0$ for all $i$ and $j$. This is because if some $c_{ij} < 0$ then we can always add a
positive number $\alpha$ to all elements of $C$ to get a matrix $\hat{C}$ in which all $\hat{c}_{ij} \ge 0$, and then
$C$ and $\hat{C}$ will have the same optimal assignment due to Lemma 5.12.1.
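Identity (5.34) can be checked numerically on a small instance. In the sketch below the cost matrix is the one of Example 5.11.1, while the shifts `alpha` and `beta` are arbitrary illustrative values (here the negatives of the row minima and of the subsequent column minima); the helper name is hypothetical.

```python
from itertools import permutations


def assignment_costs(cost):
    """Objective value of every assignment, keyed by the permutation p (person i -> job p[i])."""
    n = len(cost)
    return {p: sum(cost[i][p[i]] for i in range(n)) for p in permutations(range(n))}


C = [[20, 27, 30], [10, 18, 16], [14, 16, 12]]
alpha = [-20, -10, -12]   # illustrative row shifts
beta = [0, -4, 0]         # illustrative column shifts
C_hat = [[C[i][j] + alpha[i] + beta[j] for j in range(3)] for i in range(3)]

f = assignment_costs(C)
f_hat = assignment_costs(C_hat)
shift = sum(alpha) + sum(beta)   # the constant in (5.34)

print(min(f, key=f.get), min(f_hat, key=f_hat.get), shift)
```

Every assignment has its cost changed by the same constant, so the minimizing assignment is unchanged.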


Motivation For The Hungarian Method 


We now assume that for the given cost matrix $C$, all $c_{ij} \ge 0$, which implies that for
any feasible assignment $x_{ij}$, the objective function value $\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} \ge 0$. Therefore,
employing Lemma 5.12.1, if starting from $C$ we get matrices
$C_1, C_2, \ldots, C_k$ such that for the cost matrix $C_k = (c^{(k)}_{ij})$ there exists a feasible assignment
$x^{(k)}_{ij}$ such that $\sum_{i=1}^{n}\sum_{j=1}^{n} c^{(k)}_{ij}x^{(k)}_{ij} = 0$, then $x^{(k)}_{ij}$ is an optimal assignment for $C_k$ and
hence for $C_{k-1}, C_{k-2}, \ldots, C_3, C_2, C_1$ and $C$. Here it must be understood that matrices
$C_1, C_2, \ldots, C_k$ have to be obtained from $C$ by employing the operation $c_{ij} + \alpha_i + \beta_j$
only so that Lemma 5.12.1 remains applicable.

Since we want that $\sum_{i=1}^{n}\sum_{j=1}^{n} c^{(k)}_{ij}x^{(k)}_{ij} = 0$ for the matrix $C_k$ and $x^{(k)}_{ij}$ is a feasible
assignment, it makes sense to have 'enough' entries as 'zero' in $C_k$ at appropriate positions.
Thus our first goal should be to perform operations of the form $c_{ij} + \alpha_i + \beta_j$ on $C$ to
generate more 'zeros' so that at some stage we get a matrix $C_k$ which has 'enough' zeros,
i.e. for $C_k$ we have $\sum_{i=1}^{n}\sum_{j=1}^{n} c^{(k)}_{ij}x^{(k)}_{ij} = 0$. The Hungarian method for solving AP($C$) is a
systematic procedure to accomplish this goal.


5.13 The Hungarian Method for solving the Assignment Problem

We now describe the Hungarian method for solving AP($C$) which is based on generating
'enough' zeros in $C$ as discussed in Section 5.12. Later, in the next section, we shall have
a more mathematical dual interpretation of the Hungarian method.

Step 1 Choose the smallest element in each row of $C$ and subtract the same from
all elements of the corresponding row. This will result in a matrix $C_1$ which has at
least one 'zero' in each row.

Step 2 Choose the smallest element in each column of $C_1$ and subtract the same from
all elements of the corresponding column. This will result in a matrix $C_2$ which has at
least one 'zero' in each row and each column.
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pey do not lie in the same row or in the game 
column. Further, more than two 'zeros' of $C$ are independent if every two of them are
so. Let $r$ be the maximum number of independent zeros in $C$; then $r$ is called the index
of $C$. Obviously $r \le n$.





Step 3 Find the maximum number of independent zeros in $C_2$, i.e. its index $r$. If $r = n$,
stop, as an optimal assignment for AP($C_2$) has been obtained where the assignments
are made at the positions of independent zeros in $C_2$. Since $C_2$ and $C_1$ both have been
obtained by employing Lemma 5.12.1, the optimal assignment for AP($C_2$) will remain
optimal for AP($C_1$) as well as for AP($C$). However if $r < n$, then it is an indication that
$C_2$ still does not have 'enough' zeros and then we go to Step 4.

Before we discuss Step 4, we take two examples where in the first example Step 3
results in $r = n$, while for the second example it results in $r < n$.
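Steps 1 to 3 can be sketched as below. The index is computed here by brute force over permutations, which is an assumption made purely for illustration and is practical only for small $n$; the function names are hypothetical. The data is the cost matrix of Example 5.11.1.

```python
from itertools import permutations


def reduce_rows_then_columns(cost):
    """Step 1 (row minima) followed by Step 2 (column minima); returns (C1, C2)."""
    c1 = [[x - min(row) for x in row] for row in cost]
    col_min = [min(col) for col in zip(*c1)]
    c2 = [[x - col_min[j] for j, x in enumerate(row)] for row in c1]
    return c1, c2


def index_of(matrix):
    """Maximum number of independent zeros, found by trying every permutation."""
    n = len(matrix)
    return max(sum(1 for i in range(n) if matrix[i][p[i]] == 0)
               for p in permutations(range(n)))


C = [[20, 27, 30], [10, 18, 16], [14, 16, 12]]
C1, C2 = reduce_rows_then_columns(C)
r = index_of(C2)
print(C1, C2, r)
```

Here $r = 2 < 3$, so for this matrix the method has to continue with Step 4.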


Example 5.13.1 Find an optimal assignment of persons to the jobs for the cost mini- 
mizing assignment problem AP(C) where 


[Cost matrix $C$ not recoverable from the scan.]

Solution Performing Steps 1 and 2 of the Hungarian method we get

[Matrices $C_1$ and $C_2$ not recoverable from the scan.]

We next find the independent zeros in $C_2$. [The description of the scanning procedure is
cut off in the scan.]
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As r=3 =n, we have got an optimal assignment for AP(C2) because making assign. 


ments at the position of independent zeros gives 


and $\sum_{i=1}^{n}\sum_{j=1}^{n} c^{(2)}_{ij}x^*_{ij} = 0$. Therefore the optimal assignment for AP($C_1$) as well as for AP($C$)
is also $X^*$. Thus the optimal assignment of persons to jobs is $P_1 \to J_2$, $P_2 \to J_3$ and
$P_3 \to J_1$ with the minimum cost of assignment as $20 + 17 + 12 = 49$.


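Example 5.13.2 Find an optimal assignment of persons to the jobs for the assignment
problem AP($C$) where

$$C = \begin{pmatrix} 20 & 27 & 30 \\ 10 & 18 & 16 \\ 14 & 16 & 12 \end{pmatrix}.$$

Solution Performing Steps 1 and 2 of the algorithm we get

$$C_1 = \begin{pmatrix} 0 & 7 & 10 \\ 0 & 8 & 6 \\ 2 & 4 & 0 \end{pmatrix}, \qquad
C_2 = \begin{pmatrix} 0 & 3 & 10 \\ 0 & 4 & 6 \\ 2 & 0 & 0 \end{pmatrix}.$$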
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‘Therefore r= 2(< 3) and so have to g0 Step 4. Now we discuss Step 4 of the Hungarian 
method. As the basic aim of this step is to possibly generate more 'zeros' in $C_2$ by
employing Lemma 5.12.1, it makes sense to cover the existing 'zeros' (both assigned
as well as unassigned) of $C_2$ so that they are intact as far as possible. For this we draw
the minimum number of lines to cover all 'zeros' of $C_2$. In this case we need only two
lines to cover all 'zeros' of $C_2$ as shown here. It is not surprising that we need only two
lines to cover all 'zeros' of $C_2$ and that for $C_2$ we also have $r = 2$. This follows because of
König's theorem stated below.

Theorem 5.13.1 (König's Theorem) The minimum number of lines to cover all
'zeros' in $C_2$ (in general $C_k$) equals the maximum number of independent zeros of $C_2$ (in
general $C_k$).


After isolating the existing 'zeros' of $C_2$, to generate (if possible) more 'zeros' in $C_2$, we
choose the smallest element of $C_2$ which is uncovered. Let this be $s$. For our example
$s = 3$. We next subtract $s$ from all elements of $C_2$ to get

$$C_2^{(1)} = \begin{pmatrix} -3 & 0 & 7 \\ -3 & 1 & 3 \\ -1 & -3 & -3 \end{pmatrix}$$

and by Lemma 5.12.1, $C_2$ and $C_2^{(1)}$ have the same optimal assignment. But in $C_2^{(1)}$ we
have lost the 'zeros' of the first column and the third row. To recover these 'zeros' we
add $s = 3$ to all elements of the first column of $C_2^{(1)}$. This gives

$$C_2^{(2)} = \begin{pmatrix} 0 & 0 & 7 \\ 0 & 1 & 3 \\ 2 & -3 & -3 \end{pmatrix}.$$

In a similar manner we add $s = 3$ to all elements of the third row of $C_2^{(2)}$ to get
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: 


$$C_2^{(3)} = \begin{pmatrix} 0 & 0 & 7 \\ 0 & 1 & 3 \\ 5 & 0 & 0 \end{pmatrix}.$$

If we call $C_2^{(3)}$ as $C_3$, then $C_3$ could have been obtained from $C_2$ directly by performing
Step 4 as described below.

Step 4 (a) Find the minimum number of lines to cover all 'zeros' of $C_2$. By König's
theorem this equals the maximum number of independent 'zeros' in $C_2$.
(b) Choose the smallest element in $C_2$ which is uncovered, i.e. which does not lie on any
line. Let this number be $s$.
(c) Subtract this number $s$ from all elements of $C_2$ which do not lie on any line.
(d) Add this number $s$ to all elements of $C_2$ which lie on two lines.
(e) Leave other elements (i.e. those elements which lie only on one line) as such.

Step 4 will result in the matrix $C_3$. Here it must be noted that $C_3$ is in fact the matrix
resulting from the matrix $C_2$ via $C_2^{(1)}$ and $C_2^{(2)}$. Therefore $C_3$, $C_2$, $C_1$ and $C$ all have
the same optimal assignment.

Step 5 Now go to Step 3 replacing $C_2$ by $C_3$ and continue.

For our example, we get

$$C_3 = \begin{pmatrix} 0 & [0] & 7 \\ [0] & 1 & 3 \\ 5 & 0 & [0] \end{pmatrix},$$

where $[0]$ denotes an assigned zero. Here the index of $C_3$ is $n = 3$. Therefore we stop and get
an optimal assignment for $C_3$ (and hence for $C_2$, $C_1$, $C$) at the position of independent
'zeros'. This gives the optimal assignment as $P_1 \to J_2$, $P_2 \to J_1$, $P_3 \to J_3$ with the
minimum cost of assignment as $27 + 10 + 12 = 49$.
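Parts (b) to (e) of Step 4 translate directly into code once the covering lines are known. The sketch below takes the covered rows and columns as given (0-indexed sets); the function name `step4` and this input convention are illustrative assumptions.

```python
def step4(matrix, covered_rows, covered_cols):
    """Subtract s from uncovered entries, add s to doubly covered entries, leave the rest."""
    n = len(matrix)
    s = min(matrix[i][j]
            for i in range(n) for j in range(n)
            if i not in covered_rows and j not in covered_cols)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            x = matrix[i][j]
            if i not in covered_rows and j not in covered_cols:
                x -= s          # entry lies on no line
            elif i in covered_rows and j in covered_cols:
                x += s          # entry lies on two lines
            row.append(x)
        result.append(row)
    return s, result


C2 = [[0, 3, 10], [0, 4, 6], [2, 0, 0]]
# Lines through the third row and the first column (0-indexed: row 2, column 0).
s, C3 = step4(C2, covered_rows={2}, covered_cols={0})
print(s, C3)
```

The result agrees with the matrix obtained by subtracting $s$ everywhere and then adding $s$ back along each covering line.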


aspects in Section 5.15, we do present an algorithm for finding the minimum number of
lines as desired.

Algorithm for Finding the Minimum Number of Lines to Cover All Zeros
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w) Draw i n all the unmarked rows and marked columns. This will give 
the minimum number of lines to cover all the 'zeros' of $C_2$.

We illustrate the above algorithm for the matrix $C_2$ as obtained in Example 5.13.2.
We have

$$C_2 = \begin{pmatrix} 0 & 3 & 10 \\ 0 & 4 & 6 \\ 2 & 0 & 0 \end{pmatrix}$$

with the zero at position (1,1) assigned and the zero at position (2,1) unassigned.

As there is no assigned zero in the second row, we mark (✓) that row. Now in the
marked row (i.e. the second row for our example) we look for the position of unassigned zeros
and mark the corresponding column. In our example, the unassigned zero is at the
position (2,1) so we mark the first column. Now in the first column (i.e. the marked
column) we look for the position of assigned zeros and mark the corresponding rows. Here,
in the first column, the assigned zero falls in the first row so we mark the first row. But
there is no unassigned zero in the first row so the chain of marking has ended. Now we
draw the lines through the unmarked rows and marked columns to get the minimum
number of lines to cover all the zeros of $C_2$ as shown here.
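The marking procedure traced above can be sketched as follows. It assumes that a maximum set of independent (assigned) zeros is already known and is passed in as a dictionary from row to column, 0-indexed. The function name and this input convention are assumptions, and the choice of (3,2) as the assigned zero of the third row is an illustrative one, since the text does not say which of the two zeros in that row is assigned.

```python
def minimum_cover(matrix, assigned):
    """Return (rows with a line, columns with a line) from the marking procedure."""
    n = len(matrix)
    row_of_col = {j: i for i, j in assigned.items()}
    marked_rows = {i for i in range(n) if i not in assigned}   # rows without an assigned zero
    marked_cols = set()
    changed = True
    while changed:
        changed = False
        # In marked rows, mark the columns that contain unassigned zeros.
        for i in list(marked_rows):
            for j in range(n):
                if matrix[i][j] == 0 and assigned.get(i) != j and j not in marked_cols:
                    marked_cols.add(j)
                    changed = True
        # In marked columns, mark the rows that contain the assigned zero.
        for j in list(marked_cols):
            i = row_of_col.get(j)
            if i is not None and i not in marked_rows:
                marked_rows.add(i)
                changed = True
    line_rows = sorted(set(range(n)) - marked_rows)   # lines through unmarked rows
    line_cols = sorted(marked_cols)                   # lines through marked columns
    return line_rows, line_cols


C2 = [[0, 3, 10], [0, 4, 6], [2, 0, 0]]
assigned = {0: 0, 2: 1}     # assigned zeros at (1,1) and (3,2) in 1-indexed terms
line_rows, line_cols = minimum_cover(C2, assigned)
print(line_rows, line_cols)
```

For this matrix the procedure draws one line through the third row and one through the first column, which is the two-line cover used in Step 4.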


5.14 A Dual Interpretation of the Hungarian Method

We consider the transportation equivalent of the assignment problem AP($C$) given at
(5.31) and write its dual as

$$\begin{aligned}
\text{Max } & \sum_{i=1}^{n} u_i + \sum_{j=1}^{n} v_j \\
\text{subject to } & u_i + v_j \le c_{ij} \text{ for all } i \text{ and } j.
\end{aligned} \qquad (5.36)$$


Here $u_i$ and $v_j$ are the dual variables which are unrestricted in sign. Let $x_{ij}$ be a
feasible assignment of AP($C$) and $u_i$ $(i = 1, 2, \ldots, n)$, $v_j$ $(j = 1, 2, \ldots, n)$ be feasible for the
dual problem (5.36); then for optimality

$$\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} = \sum_{i=1}^{n} u_i + \sum_{j=1}^{n} v_j,$$

i.e.

$$\sum_{i=1}^{n}\sum_{j=1}^{n} (c_{ij} - u_i - v_j)\,x_{ij} = 0. \qquad (5.37)$$

But $x_{ij} \ge 0$ and $\bar{c}_{ij} = c_{ij} - u_i - v_j \ge 0$ as $x_{ij}$ and $(u_i, v_j)$ are feasible to (5.31) and (5.36)
respectively. Therefore (5.37) implies

$$\bar{c}_{ij}x_{ij} = 0 \text{ for all } i \text{ and } j. \qquad (5.38)$$

The optimality conditions (5.38) are the usual complementarity conditions which
characterize optimality. The Hungarian method, as described in Section 5.13,
essentially finds feasible vectors $(u_i, v_j)$ and $x = (x_{ij})$ such that (5.38) is satisfied. We
now explain this interpretation of the Hungarian method.


1. In Step 1, we find the smallest element in each row of $C$ and subtract the same from
all elements of the corresponding row to get the matrix $C_1$. This essentially finds the
dual variables $u_i$ $(i = 1, 2, \ldots, n)$ given by

$$u_i = \min_{1 \le j \le n} (c_{ij})$$

and then $c^{(1)}_{ij}$ equals $c_{ij} - u_i$ for all $i$ and $j$.

2. In Step 2, we find the smallest element in each column of $C_1$ and subtract the same
from all elements of the corresponding column to get the matrix $C_2$. This essentially
finds the dual variables $v_j$ $(j = 1, 2, \ldots, n)$ given by

$$v_j = \min_{1 \le i \le n} (c^{(1)}_{ij}) = \min_{1 \le i \le n} (c_{ij} - u_i)$$

and then $c^{(2)}_{ij}$ equals $c_{ij} - u_i - v_j$ for all $i$ and $j$.

Here it may be noted that $c^{(2)}_{ij} \ge 0$, i.e. $c_{ij} - u_i - v_j \ge 0$ for all $i$ and $j$. Thus at the end
of Step 2, a feasible solution $(u_i, v_j)$ of the dual of AP($C$) has been obtained. In Step
3, when $r < n$, we essentially update the current dual solution $(u_i, v_j)$ suitably
so that eventually we obtain a feasible solution $(u_i^{(k)}, v_j^{(k)})$ of the dual of AP($C$) for
which the complementary slackness conditions (5.38) hold. At this stage we not only
get the optimal assignment $x^*_{ij}$ but also have $c_{ij} - u_i^{(k)} - v_j^{(k)} = 0$ for all $(2n - 1)$ basic
cells.
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af sth i 

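$$v_j^{(2)} = v_j + \begin{cases} -s & \text{if the } j^{th} \text{ column is covered (i.e. it is a marked column)} \\ 0 & \text{otherwise.} \end{cases} \qquad (5.40)$$

This gives

$$c^{(3)}_{ij} = c_{ij} - u_i^{(2)} - v_j^{(2)},$$

where $u_i^{(2)}$ and $v_j^{(2)}$ are the updated $u_i$ and $v_j$ values as given above, and $s$ is the
smallest uncovered element in $C_2$.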
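The dual reading of Steps 1, 2 and 4 can be checked numerically on the cost matrix of Example 5.13.2. The sketch assumes one reading of the update rules that is consistent with Step 4: $u_i$ is increased by $s$ for each marked (uncovered) row and $v_j$ is decreased by $s$ for each marked (covered) column. The exact form of (5.39) is not legible in the scan, so this form is an assumption.

```python
C = [[20, 27, 30], [10, 18, 16], [14, 16, 12]]
n = len(C)

# Steps 1 and 2: row minima, then column minima of the row-reduced matrix.
u = [min(row) for row in C]
v = [min(C[i][j] - u[i] for i in range(n)) for j in range(n)]

# Step 4 data for this example (0-indexed): marked rows 1 and 2, marked column 1, s = 3.
marked_rows, marked_cols, s = {0, 1}, {0}, 3
u2 = [u[i] + s if i in marked_rows else u[i] for i in range(n)]   # assumed form of (5.39)
v2 = [v[j] - s if j in marked_cols else v[j] for j in range(n)]   # as in (5.40)

C3 = [[C[i][j] - u2[i] - v2[j] for j in range(n)] for i in range(n)]
dual_value = sum(u2) + sum(v2)
print(u2, v2, C3, dual_value)
```

The updated pair stays dual feasible (all entries of $C_3$ are non negative) and its dual objective value equals the optimal assignment cost 49.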


We illustrate this with the help of the example given below.

Example 5.14.1 Consider the assignment problem AP($C$) of Example 5.13.2 and find
its optimal solution. Also find the optimal solution of the dual of AP($C$).

Solution We have

$$C = \begin{pmatrix} 20 & 27 & 30 \\ 10 & 18 & 16 \\ 14 & 16 & 12 \end{pmatrix}.$$
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os 
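$$C_3 = \begin{pmatrix} 0 & [0] & 7 \\ [0] & 1 & 3 \\ 5 & 0 & [0] \end{pmatrix}. \qquad (5.41)$$

Also looking at the position of independent zeros in $C_3$ we have

$$X^* = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (5.42)$$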


But from (5.41) and (5.42) we have $\bar{c}_{ij}x^*_{ij} = 0$ for all $i$ and $j$ and hence conditions
(5.38) are satisfied. Therefore $X^*$ is an optimal assignment for AP($C$), i.e. $P_1 \to J_2$,
$P_2 \to J_1$ and $P_3 \to J_3$.

To get an optimal solution of the dual of AP($C$), we may consider the transportation
equivalent of the given problem and solve the same by the transportation algorithm.
We may refer to Example 5.11.1 in this regard where the problem of the given Example
5.14.1 has already been solved by the transportation algorithm. In the last transportation
tableau, we not only get the optimal assignment but also an optimal solution of the
dual problem. As obtained in Example 5.11.1, for the given assignment problem AP($C$),
the optimal solution of its dual is $u_1 = 0$, $u_2 = -9$, $u_3 = -13$, $v_1 = 19$, $v_2 = 27$, $v_3 = 25$
and the optimal value of the dual is 49.

There is a much better way of getting an optimal solution of the dual of AP(C) 
without resorting to the transportation algorithm from the very beginning. From (5.42) 


the optimal transportation tableau of AP(C) is 
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eich gives" as F y #2 +01 = 10, uz + v2 =18, uz +v = 16, uz +v = 12. Taking 
$u_1 = 0$ we get $u_2 = -9$, $u_3 = -13$, $v_1 = 19$, $v_2 = 27$ and $v_3 = 25$ as an optimal solution
of the dual of AP($C$).

In this context it may be noted that the optimal solution of the dual of AP($C$) is not
unique. This is because there are alternate possibilities of putting the 'zero' basic cells
in the tableau and of choosing an arbitrary value for one of the variables $u_i$ and $v_j$.


pemark 5.14.1 While using the Hungarian method, it is not true that with each it- 

eration the number of ‘assigned zeros’ increases by one. For example, if we consider 
problem AP(C) for 

if 4 - 10° Te S8 

mM, 6) 15.25 

C 395. 10 25 40 20 

36 16 40 64 $32 

42 12 30 48 24 

28 s87 20. S32 


then we can verify that the maximum number of independent zeros in C2 is 2. But the
maximum number of independent zeros in C3, C4, and C5 remains 3. Further it remains
4 for C6 and C7, and the optimal assignment, i.e. six independent zeros are obtained only
for C12. The only guarantee of the method is that the maximum number of independent
zeros will not decrease and for some finite k, AP(Ck) will have n independent zeros.




5.15 Finite Convergence of the Hungarian Method 


In this section we prove the finite convergence of the Hungarian method. Let us recollect
that for the assignment problem AP(C) with $C = (c_{ij})$, Steps 1 and 2 of the algorithm
give the matrix $C_2 = (c_{ij}^{(2)})$ where for all i and j, $c_{ij}^{(2)} = (c_{ij} - u_i - v_j) \ge 0$.
Further, in case r < n, Step 4 of the algorithm generates the matrix $C_3 = (c_{ij}^{(3)})$,
$c_{ij}^{(3)} = (c_{ij} - u_i^{(2)} - v_j^{(2)})$, where $u_i^{(2)}$ and $v_j^{(2)}$ are updated dual variables as per equations
(5.39) and (5.40). Also $c_{ij}^{(3)} = c_{ij} - u_i^{(2)} - v_j^{(2)} \ge 0$ for all i and j. Now from equations
(5.39) and (5.40) we have

$$c_{ij}^{(2)} - c_{ij}^{(3)} = \begin{cases} s, & \text{if row } i \text{ is marked and column } j \text{ is unmarked} \\ -s, & \text{if row } i \text{ is unmarked and column } j \text{ is marked} \\ 0, & \text{otherwise.} \end{cases}$$

Therefore

$$\sum_{i=1}^{n}\sum_{j=1}^{n} \left(c_{ij}^{(2)} - c_{ij}^{(3)}\right) = p(n - q)s - (n - p)qs = n(p - q)s, \qquad (5.43)$$


where p is the number of marked rows and q is the number of marked columns.

We can verify (5.43) for Example 5.14.1, where n = 3, s = 3, p = 2, and q = 1. For
this example

$$C_2 - C_3 = \begin{pmatrix} 0 & 3 & 3 \\ 0 & 3 & 3 \\ -3 & 0 & 0 \end{pmatrix}.$$

Therefore $\sum_{i}\sum_{j} \left(c_{ij}^{(2)} - c_{ij}^{(3)}\right) = 12 - 3 = 9$, which equals $n(p - q)s = 3(2 - 1)3 = 9$.
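The counting argument behind (5.43) can be checked numerically. The sketch below assumes the usual update of Step 4: entries in a marked row and an unmarked column are reduced by s, entries in an unmarked row and a marked column are increased by s, and all other entries are unchanged. The particular marking used (rows 1 and 2 and column 1 marked, written with 0-based indices) is an assumption chosen only to match p = 2 and q = 1; the function name is hypothetical.

```python
def reduction_difference(n, marked_rows, marked_cols, s):
    """Return the matrix C2 - C3 for one Step 4 update (assumed update rule)."""
    diff = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in marked_rows and j not in marked_cols:
                diff[i][j] = s        # uncovered entry: reduced by s
            elif i not in marked_rows and j in marked_cols:
                diff[i][j] = -s       # doubly covered entry: increased by s
    return diff


n, s = 3, 3
marked_rows, marked_cols = {0, 1}, {0}     # assumed marking with p = 2, q = 1
p, q = len(marked_rows), len(marked_cols)

diff = reduction_difference(n, marked_rows, marked_cols, s)
total = sum(sum(row) for row in diff)

print(diff)                      # entry-wise difference C2 - C3
print(total, n * (p - q) * s)    # both sides of (5.43)
```

Any other marking with the same p and q gives the same total, since the sum depends only on the counts p(n - q) and (n - p)q.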

Let r denote the maximum number of independent ‘zeros’ in C2. Then by König’s
theorem r equals the minimum number of lines to cover all ‘zeros’ in C2. But this
minimum number equals the sum of the number of unmarked rows and the number of
marked columns, i.e. r = (n - p) + q, or equivalently (p - q) = n - r. Therefore (5.43) gives

$$\sum_{i=1}^{n}\sum_{j=1}^{n} \left(c_{ij}^{(2)} - c_{ij}^{(3)}\right) = n(p - q)s = n(n - r)s. \qquad (5.44)$$

Now for r < n, i.e. when the current assignment is not optimal, n(n - r)s > 0.
Also without any loss of generality we can assume that C has integer entries (when cij
are rational, we can always take the L.C.M.), which gives that s is an integer. Therefore
n(n - r)s is an integer.

The above discussion shows that at the end of Step 4 we have (i) a new matrix C3, with
non-negative entries, on which the markings can be continued (ii) the sum of all the entries of
the matrix C3 is less than the sum of all the entries of the matrix C2 by the positive
integer n(n - r)s. As the entries remain non-negative integers, their sum cannot keep
decreasing indefinitely.

Hence the Hungarian method is bound to terminate in finite number of iterations.
• Sections 5.3-5.8 present the complete working of the transportation algorithm for
solving the balanced TP. The importance of unimodularity and related results are
discussed in Section 5.4.

• The unbalanced transportation problem is discussed in Section 5.9.

• Section 5.10 discusses the assignment problem and gives its mathematical model.
The transportation equivalent of the assignment problem is given in Section 5.11.

• Sections 5.12-5.14 present the complete working of the Hungarian method for solving
the assignment problem while Section 5.15 establishes its finite convergence.


• The transportation problem was first formulated by F.L. Hitchcock in 1941 which was
later discussed in detail by T.C. Koopmans in 1949. L.V. Kantorovich also discussed
the transportation problem in 1951.

• The adaptation of the simplex method to solve TP was given by G.B. Dantzig in
1954 and also by A. Charnes and W.W. Cooper in 1954.

• There is a very close connection between the transportation and assignment problems,
and certain bipartite graphs. For a graph theoretic treatment of the transportation
and assignment problem we may refer to the texts by Berge and Ghouila-Houri
[18] and Bazaraa et al. [12].

• The Hungarian method for solving the assignment problem was developed by H.W.
Kuhn in 1955.
• Similar to the cost minimizing transportation and assignment problems, we may also
study time minimizing problems. These problems have been studied in the literature
e.g. Hammer [74], Garfinkel and Rao [64], and Szwarc [152].

• A very natural extension of the transportation problem is the multi-commodity
transportation problem; originally studied by Haley [73], who also gave many variants of
the same. These problems have many applications in mining industries.

• Burkard [30] studied the cost minimizing TP and the time minimizing TP in a
unified setting of algebraic linear programming. He developed a general
method for the algebraic transportation problem which is applicable to both the cost
minimizing as well as the time minimizing transportation problems. [...] transportation
problem over a commutative ordered semigroup [...] both types (cost and time) of
transportation problems. [...] studied the algebraic assignment problems in the setting
of algebraic programming.

• If we allow that an item may reach a destination via another source or via another
destination or a combination of these, then the transportation problem is called
the transportation with transhipment or in short the transhipment problem. In this
[...] transportation problem without transhipment. The [...] destination) can be
transformed into

• The assignment problem is also related with the famous travelling salesperson problem
(TSP). In fact solving an n city travelling salesperson problem is equivalent to
finding a cyclic solution of the usual n-persons-n-jobs assignment problem. In the
literature, there are certain approximate algorithms for solving TSP based on this
equivalence.


5.17 Exercises 
5.1 Consider the following (TP)





Determine a starting b.f.s by employing 
1. the north west corner rule 
2. the column minima 
3. the row minima 
4. the matrix minima and 
5. VAM(Vogel’s Approximation Method). 


(i) 
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95 10 20 

20 

E 4 
10 

7 5 2 | 
15 | 

7 le 12 | 
15 | 


5.3 Solve the following TP and identify the source which will have excess supply
p 45 10 -15 





5.4 In an unbalanced transportation problem, sometimes there are penalties for
unsatisfied demand, to reflect the failure of the supplier to meet the required demand.
Consider the problem
5.5 Consider the transportation problem given at Question 5.4 above. Suppose it is
given that the demand of the destination D1 must be satisfied exactly. Suggest a procedure
to solve the given transportation problem and also identify the destination which
will have maximum short supply.
5.6 Solve the following (TP) by finding the initial solution using the method of column
minima

15 18 1





Here the destination D3 has black listed source S3 and will not accept any supply 
from that source. 


5.7 Solve the following cost minimizing transportation problem starting with the b.f.s.
x12 = 30, x21 = 40, x32 = 20, x43 = 60 and other xij = 0.


40, 50 60 





5.8 Consider the following transportation problem.

1. Using the column minima method to find a starting solution, obtain the minimum
cost transportation schedule.
2. Write the dual (D) of the given (TP) and obtain its optimal solution.


5.9 Consider the following transportation problem 
. 1 3 





1. Prove by duality theory that (a3 =Q, Fis = oo ae =p, ton — © 
a 05 = 4) is an optimal solution | ME 
2. Will this optimal solution change if each cost element is multiplied by 10?


5.10 1. Solve the following LPP
Min 3x1 + 4x2 + 2x4 + 3x5 + 4x6
subject to
x1 + x2 + x3 = 13
x4 + x5 + x6 = 9
x1 + x4 = 8
x2 + x5 = 4
x3 + x6 = 6
x1, x2, x3, x4, x5, x6 ≥ 0.
2. Write the dual (D) of above LPP and obtain its optimal solution.


5.11 Consider the following assignment problem (AP)






















Mps ini st assignment. 
De pi os Gites assignment if it is given that Job J4 can not be assigned 


20 ’ - 
an > MD: . ye 
A" QY 
’ L J uy Stet. = 
N x d v kgh £ Š , f 


if +h >» erson is sure to do the job Ja. À 
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IV 





5.13 Given the following data, find the optimal assignment of territories to salespersons
so that the total sales are minimized.

Territory | Annual Sales (in Rs)
I         | 60,000
II        | 50,000
III       | 40,000
IV        | 30,000

Also working under the same conditions, the yearly sales of persons are in the following
proportions





5.14 Consider the following problem (AP)





1. Express the above (AP) as a (LPP) and write its dual (D). 


2. Treat the given (AP) as a special case of (TP) and solve the same by the transportation
algorithm.





| i ofits dual A 


j i leo 4 airy 
J USE tie ans Wer | O00. aine oqi at ‘14, 






d a solution of the dual (D). 
A: Pp ve p h y be 
jects. F “ive projects S lare a to the 


NOUusand nt 
SF the and Oj KR UY DE ee, s) 
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P3 








Ps 


Ey 82 
Ep 59 
Es 


81 
50 

Due to organizational constraints, only one engineer can be assigned to a project and
vice-versa. Project P1 is very important and it must be taken by the company. Find the
optimal assignment of engineers to the projects so as to maximize the total profit. Also
identify the project which the company should decline.


5.16 Consider the following LPP 


Max 2x1 + 3x2 + 4x3 + x4 + 7x5 + 5x6 
subject to 

Ml at eS | 

TER on MoS Ml 

XiT Xe SI 


X e a = l 
X2 +x4+xç=1 
TNA N, XE = 0. 
1. Solve the above LPP. 
2. Write the dual of above LPP and solve the same. 


5.17 Are the following statements true? Give reasons for your answer. 


1. For a balanced transportation problem with 4 sources and 3 destinations, the columns
p11, p13, p22, p23, p42 and p43 of the coefficient matrix A are linearly independent.

2. If all the cost elements cij of a (4 x 4) assignment problem (AP) are increased by 5,
then the optimal value will also increase by 5.

3. The set of all (n x n) permutation matrices is a convex set.

4. A balanced transportation problem with all cij ≥ 0 can not have unbounded solution.

5. If A is a unimodular matrix then so is A^T.

6. All matrices having entries as 0, +1 and -1 are unimodular.

7. The vector p32 - p22 + p43 has exactly two entries as ‘one’.

8. Let X = (Zij) be an optimal assignment of AP(C) then it is also optimal for problem 

_AP(C?). 










Construct an example of each of the following (if no such example is possible,
Integer Linear Programming

6.1 Introduction


Certain linear programming problems require that some or all the variables take on only
integer values. We call such problems as integer linear programming problems (ILP’s).
Integer LPP’s are very common in real life applications, e.g. in a production problem,
the items being produced may be in complete units (say items are T.V. sets of 21"
and 29") and therefore fractional number of items may not have any meaning. Also
sometimes, because of the physical restrictions, the constraints of the problem could be
‘either - or’ or ‘at least k out of the given m constraints’ which again lead to ILP’s. A
special type of ILP, namely 0 - 1 ILP, occurs very frequently in the area of electrical
circuits.
Integer linear programming problems, inherently being combinatorial, constitute a
‘hard’ class of optimization problems. Our first aim in this chapter is to understand the
difficulties which arise in the algorithmic development of ILP’s and then to discuss two
basic algorithms for solving ILP’s, namely Gomory’s cutting plane method and Land
and Doig’s branch and bound method. Though there are several modifications of these
two basic methods (so we have a class of ‘cutting plane methods’ and ‘branch and bound
methods’) we do not attempt to discuss them here. In literature, the theory of finite
abelian groups has also been applied for solving ILP’s but that again has been left out
in our presentation.








6.2 Mathematical Model and Some Possible Approaches 


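$$\text{Max } z = \sum_{j=1}^{n} c_j x_j$$

subject to

$$\sum_{j=1}^{n} a_{ij} x_j = b_i \quad (i = 1, \ldots, m)$$
$$x_j \ge 0 \quad (j = 1, \ldots, n) \qquad (6.1)$$

and
$x_j$ integer for $j \in J_1 \subseteq J$,

where $J = \{1, 2, \ldots, n\}$.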
An ILP is called an all integer LPP (AILP) if all the variables of the problem are
constrained to take integer values only. The problem is called mixed integer LPP (MILP)
if some but not all variables are constrained to take integer values. Thus, in (6.1), if
J1 = J, then the problem is all integer, otherwise it is a mixed integer LPP. Here it may
be observed that as the constraints in (6.1) are given in Ax = b form, the problem is
all integer if all variables, including the slack and surplus variables, are constrained to
take integer values. Thus the problem

Max z = 4x1 + 3x2
subject to
x1 + x2 ≤ 8
2x1 + x2 ≤ 10
x1, x2 ≥ 0
x1 and x2 integer                (6.2)

is an all integer LPP, because once x1 and x2 are non-negative integers, the slack variables
x3 = 8 - x1 - x2 and x4 = 10 - 2x1 - x2 are also non-negative integers. Further if in (6.2), we
take the second constraint as √2 x1 + x2 ≤ 10 (instead of 2x1 + x2 ≤ 10), then this problem
is no more all integer. It becomes a mixed integer LPP because x4 = 10 - √2 x1 - x2 will
not meet integer requirement even if x1 and x2 are integers.
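The remark about slack variables can be illustrated with a short numerical check. The sketch below enumerates the non-negative integer pairs (x1, x2) that satisfy the constraints of (6.2) and tests whether both slacks are integers, first with the coefficient 2 and then with √2. The helper names and the tolerance are assumptions made for illustration; floating point arithmetic is used, so this is a numerical check and not a proof.

```python
import math


def slacks(x1, x2, a21):
    """Slack values x3, x4 of (6.2) when the second constraint is a21*x1 + x2 <= 10."""
    return 8 - x1 - x2, 10 - a21 * x1 - x2


def is_integer(value, tol=1e-9):
    return abs(value - round(value)) < tol


# Coefficient 2: every integer feasible point has integer slacks.
all_integer_case = all(
    is_integer(x3) and is_integer(x4)
    for x1 in range(9) for x2 in range(9)
    for x3, x4 in [slacks(x1, x2, 2)]
    if x3 >= 0 and x4 >= 0
)

# Coefficient sqrt(2): feasible integer points with x1 >= 1 whose slack x4 is an integer.
mixed_case = [
    (x1, x2)
    for x1 in range(1, 9) for x2 in range(9)
    for x3, x4 in [slacks(x1, x2, math.sqrt(2))]
    if x3 >= 0 and x4 >= 0 and is_integer(x4)
]

print(all_integer_case)   # whether all slacks are integers in the original problem
print(mixed_case)         # points with an integer slack x4 in the modified problem
```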


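For the integer LPP (6.1), we define its associated LPP, namely the LPP obtained
from the ILP (6.1) when integer constraints ‘$x_j$ integer for $j \in J_1 \subseteq J$’ are ignored. Thus
the associated LPP (say (LP)1) of the ILP (6.1) is

$$\text{Max } z = \sum_{j=1}^{n} c_j x_j$$

subject to

$$\sum_{j=1}^{n} a_{ij} x_j = b_i \quad (i = 1, \ldots, m)$$
$$x_j \ge 0 \quad (j = 1, \ldots, n). \qquad (6.3)$$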
The most natural and common sense approach of solving the given ILP (6.1) seems
to be to solve its associated LPP as given in (6.3), by the simplex algorithm and then
round off the optimal solution so obtained to the nearest integers. The example given
below brings out an important fact that rounding off is not a correct approach to solve
ILP’s because this, in general, will lead to infeasibility and/or non optimality.
Let us consider the ILP given by

Max z = 21x1 + 11x2
subject to
7x1 + 4x2 ≤ 13
x1, x2 ≥ 0                (6.4)
x1 and x2 integer.

The feasible set of the above ILP is the following set of six discrete points
(0,0), (0,1), (1,0), (1,1), (0,2), (0,3).


These points are essentially the points having integer coordinates which are inside the 
feasible region of the associated LPP. This we can visualize in Fig 6.1. 








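[Figure: feasible region of the associated LPP of (6.4) with its integer points; the corner point (x1 = 13/7, x2 = 0) on the x1 axis is labelled.]

Fig. 6.1.

To get the optimal solution of ILP (6.4), we have to simply evaluate the value
of the objective function at those points which are feasible. This gives the optimal solution
of ILP (6.4) as (x1* = 0, x2* = 3) and the optimal value as z* = 33.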
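The enumeration for ILP (6.4), and the failure of rounding, can be reproduced in a few lines. In the sketch below the corner point (13/7, 0) is taken as the optimum of the associated LPP on the assumption that x1 has the better objective-to-constraint ratio (21/7 against 11/4); the function names are hypothetical.

```python
from fractions import Fraction


def feasible(x1, x2):
    """Constraints of ILP (6.4) without the integrality requirement."""
    return x1 >= 0 and x2 >= 0 and 7 * x1 + 4 * x2 <= 13


def objective(x1, x2):
    return 21 * x1 + 11 * x2


# Enumerate the integer feasible points (a small box is enough here).
points = [(x1, x2) for x1 in range(3) for x2 in range(5) if feasible(x1, x2)]
best = max(points, key=lambda p: objective(*p))

# Assumed optimum of the associated LPP: all resource spent on x1.
lp_opt = (Fraction(13, 7), Fraction(0))
rounded = (round(lp_opt[0]), round(lp_opt[1]))     # nearest integers
truncated = (int(lp_opt[0]), int(lp_opt[1]))       # rounding down

print(points)                                      # the six integer feasible points
print(best, objective(*best))                      # ILP optimum and its value
print(rounded, feasible(*rounded))                 # nearest-integer rounding
print(truncated, objective(*truncated))            # rounding down
```

Rounding to the nearest integers leaves the feasible region, while rounding down stays feasible but gives a value well below the ILP optimum.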


Suppose we now solve the associated LPP of the ILP (6.4) by the simplex algorithm,
i.e. solve the following
... but whose feasible region is the polytope obtained by taking the convex hull of the
feasible set of the ILP. Thus solving ILP (6.1) is equivalent to solving the LPP

$$\text{Max } z = \sum_{j=1}^{n} c_j x_j$$

subject to
$$(x_1, \ldots, x_n) \in S,$$
where S is the polytope obtained by taking the convex hull of those points $(x_1, \ldots, x_n)$
for which the following hold

$$\sum_{j=1}^{n} a_{ij} x_j = b_i \quad (i = 1, \ldots, m)$$
$$x_j \ge 0 \quad (j = 1, \ldots, n)$$
$$x_j \text{ integer for } j \in J_1 \subseteq J = \{1, \ldots, n\}.$$


This method of solving ILP’s is perfectly valid except that there are certain practical
difficulties in getting the desired convex hull, particularly when the dimension of the
Euclidean space is more than two or three. Therefore, we again do not pursue this
approach for solving ILP’s. However, the observation that there is always a polytope
inside the feasible region of the associated LPP of the given ILP such that the corner
points of this polytope meet integer requirements, has led to two other approaches
for solving ILP’s. These are cutting plane methods, and branch and bound methods.
As ILP’s are essentially ‘hard’ problems, no algorithm can really be very ‘efficient’. 
However, cutting plane methods and branch and bound methods have proved to be 
very satisfactory in many real life applications of ILP’s and we propose to discuss them 
in the subsequent sections. As mentioned in Section 6.1, our discussion will be restricted 
to Gomory’s cutting plane method and Land and Doig’s branch and bound method only. 


6.3 Gomory’s Cutting Plane Method for All Integer Linear 
Programming 


We consider the all integer LPP 


subject to 
... L.C.M. and get the entries as integers only. Therefore for AILP, the objective function is
automatically constrained to be integer.

Let (LP)1 be the associated LPP of the given AILP (6.5). Let us solve (LP)1 by the
simplex method and let x^(1) be its optimal solution. If x^(1) meets the integer requirements,
then x^(1) is optimal for the given AILP. Otherwise x^(1) is not feasible for the given
AILP. In this situation, Gomory proposed to introduce a cut constraint, i.e. a constraint
of the form p^T x ≤ d, and append it to (LP)1 to get a new LPP, say (LP)2.
The basic purpose of the cut constraint is to delete a part of the feasible region S1 (S1
being the feasible region of (LP)1), but making sure that we do not delete any point of
S1 which has integer coordinates. Thus a cut constraint, by definition, certainly deletes
some portion of the feasible region of the associated LPP, but it does not
delete those points which are useful to us, namely the points having integer coordinates.
Now we may solve (LP)2 and repeat the procedure. Gomory derived an appropriate cut
constraint and established that only finitely many cut constraints will be needed to
solve any instance of the given AILP. The cut constraint given by Gomory is called
the Gomory’s cut constraint and this method of solving AILP is called the Gomory’s
cutting plane method for all integer linear programming. There are many other cutting
plane methods available in the literature but they are all based on this basic method
due to Ralph E. Gomory.

Before we provide the derivation of the Gomory’s cut constraint for AILP, we make
an important observation. The LPP’s, (LP)1 and (LP)2 differ only by one additional
constraint. Therefore (LP)2 need not be solved from the very beginning. We can use
the dual simplex method to solve it by appending the additional constraint, namely the
cut constraint, to the last (optimal) tableau of (LP)1. This is of course true for any two
consecutive LPP’s, say (LP)k and (LP)k+1.


Derivation of the Gomory’s Cut Constraint 


To derive the Gomory’s cut constraint, we first obtain the canonical representation of
the system Ax = b with respect to the current basis matrix B. Writing R for the matrix of
nonbasic columns, we have

$$B x_B + R x_R = b$$

i.e.

$$x_B = B^{-1}b - B^{-1}R x_R$$

i.e.

$$x_{B_i} = y_{i0} - \sum_{j \in R} y_{ij} x_j \quad (i = 1, \ldots, m). \qquad (6.7)$$

Here $y_{i0}$ is the $i$th component of the vector $B^{-1}b$, $y_{ij}$ is the $i$th component of the vector
$y_j = B^{-1}a_j$ and $j \in R$ is understood as the index j running over the index set of
nonbasic variables.

The m equations given at (6.7) are said to be the canonical representation of the 
system Ax = b, because each equation of (6.7) has only one basic variable with coefficient 
as ‘one’, and all other variables in the equation are nonbasic variables. What we have 
done for the constraints Ax = b, can also be done for the objective function, because 


$$z = c^T x = c_B^T x_B + c_R^T x_R = c_B^T (B^{-1}b - B^{-1}R x_R) + c_R^T x_R = c_B^T (B^{-1}b) - (c_B^T B^{-1}R - c_R^T) x_R,$$

which can be written as

$$x_{B_0} = y_{00} - \sum_{j \in R} y_{0j} x_j, \qquad (6.8)$$

where $x_{B_0} = z$, $y_{00} = c_B^T (B^{-1}b)$ and $y_{0j} = z_j - c_j$. Therefore combining (6.7) and (6.8) we
get

$$x_{B_i} = y_{i0} - \sum_{j \in R} y_{ij} x_j, \qquad (6.9)$$

for i = 0, 1, ..., m, where i = 0 refers to the objective function and i = 1, ..., m refer to
the m constraints. Further the current b.f.s. and the current objective function value are
... not satisfied by the point $x_B$ for some i. This is because for some i, $x_{B_i}$ is not integer with
$f_{i0} > 0$ ($f_{i0}$ and $f_{ij}$ being the fractional parts of $y_{i0}$ and $y_{ij}$). But at the current b.f.s.
$x_R = 0$, and then (6.13) gives $f_{i0} \le 0$, which is a contradiction. Thus
the constraint (6.13) is satisfied by every integer feasible point of AILP but certainly
not satisfied by the current b.f.s. $x_B$. In other words it certainly deletes a part
of the feasible region of the associated LPP (at least the current b.f.s. $x_B$ and may
be more) but does not delete any feasible point with integer coordinates.
Therefore (6.13) is a valid cut constraint and it is called the Gomory’s cut constraint.
We can also write the Gomory’s cut constraint (6.13) as

$$-f_{i0} = s_i - \sum_{j \in R} f_{ij} x_j \qquad (6.14)$$

and append the same to the associated LPP, (LP)1, to get the new LPP, (LP)2. Therefore
we solve (LP)2 and repeat the procedure.
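As a small illustration of how a cut of the form (6.14) is read off a tableau row, the sketch below computes the fractional parts $f_{i0}$ and $f_{ij}$ with exact rational arithmetic. The row data are an assumption: they are the objective row and the x1 row of the optimal tableau of the associated LPP of Example 6.3.2, worked out by hand for this illustration (basis x1, x2; nonbasic x3, x4). The function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def frac(y):
    """Fractional part y - floor(y); it lies in [0, 1) also for negative y."""
    return y - floor(y)


def gomory_cut(y_i0, y_row):
    """Return (f_i0, {j: f_ij}) for the cut  sum_j f_ij * x_j >= f_i0."""
    return frac(y_i0), {j: frac(y) for j, y in y_row.items()}


# Assumed rows (computed by hand): x_Bi = y_i0 - sum_j y_ij * x_j, j in {3, 4}.
objective_row = (Fraction(75, 4), {3: Fraction(1, 4), 4: Fraction(3, 2)})
x1_row = (Fraction(13, 4), {3: Fraction(-1, 4), 4: Fraction(1, 2)})

f00, f0 = gomory_cut(*objective_row)
f10, f1 = gomory_cut(*x1_row)

print(f00, f0)   # cut from the objective row (i = 0)
print(f10, f1)   # cut from the x1 row (i = 1)

# At the current b.f.s. all nonbasic variables are zero, so the left-hand side is 0
# and the cut is violated whenever f_i0 > 0.
violated = 0 < f00
```

With these rows the objective row has the larger $f_{i0}$, which is the row one would pick under the "maximum $f_{i0}$" rule.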


Stepwise Description 


The stepwise description of the Gomory’s cutting plane method for AILP is as follows

Step 1 Solve the associated LPP, say (LP)1, by the simplex method. Set k = 1.

Step 2 If the optimal solution of the current LPP, (LP)k, is integer, stop as an optimal
solution of the given AILP is at hand. Otherwise go to Step 3.

Step 3 For any updated constraint i whose $y_{i0}$ value is fractional (including i = 0,
i.e. the objective function), generate the Gomory’s cut constraint as given at (6.13). A
common procedure here is to select that value of i, 0 ≤ i ≤ m, for which $f_{i0}$ value is
maximum. Theoretically we can choose any i for which $f_{i0} > 0$ but the maximum of $f_{i0}$
is chosen with the hope that it may give a deeper cut.

Step 4 Append the Gomory’s cut constraint derived at Step 3 above to (LP)k to get
the new LPP (LP)k+1. Solve (LP)k+1 by the dual simplex method, replace k by k + 1 and
return to Step 2.


Theorem 6.3.1. The number of Gomory’s cut constraints needed to solve any instance 
of all integer linear programming problem is always finite. 


Though we shall not prove the above theorem, there are few points to be noted here.
As the number of cut constraints needed is always finite, we are solving only finitely
many LPP’s to get an optimal solution of the given AILP. But, unfortunately, even for
a problem of ‘average’ size, the number of cut constraints needed may be ‘too many’
(theoretically ‘exponential’ in the worst case) as AILP belongs to the class of ‘hard’
problems.

We also need to appreciate some other computational difficulties in this method.
Whenever we add a cut constraint to (LP)k, we immediately introduce one additional row
and one additional column in the existing simplex tableau. Since in a given problem
the number of cut constraints needed may be very large, after some iterations the
tableau may become too big to manage. There are some procedures which can be augmented
to take care of the problem of tableau management.
Example 6.3.2. Consider the integer LPP

Max z = 5x1 + 2x2
subject to
2x1 + 2x2 ≤ 9
3x1 + x2 ≤ 11
x1, x2 ≥ 0
x1, x2 integer.                (6.15)

Show that problem (6.15) is an AILP and hence solve the same by the Gomory’s
cutting plane method. Also verify graphically that each of the Gomory cut constraints is
really deleting a part of the feasible region but not deleting any of the integer feasible
points.

Solution The given ILP is equivalent to

Max z = 5x1 + 2x2 + 0x3 + 0x4
subject to
2x1 + 2x2 + x3 = 9
3x1 + x2 + x4 = 11
x1, x2, x3, x4 ≥ 0 and all integer.                (6.16)

As x1, x2 are constrained to be integers, it is assured that x3 = 9 - 2x1 - 2x2 and
x4 = 11 - 3x1 - x2 are also integers. Therefore the ILP (6.15) is in fact AILP. We now
employ the Gomory’s cutting plane method to solve AILP (6.16).
First iteration 


We consider the associated LPP (i.e. (LP)1) given by

Max z = 5x1 + 2x2 + 0x3 + 0x4
subject to
2x1 + 2x2 + x3 = 9
3x1 + x2 + x4 = 11
x1, x2, x3, x4 ≥ 0.
Max z = 5x1 + 2x2
subject to
2x1 + 2x2 + x3 = 9
3x1 + x2 + x4 = 11
-0.25x3 - 0.5x4 + s1 = -0.75
x1, x2, x3, x4, s1 ≥ 0.

Second iteration 


We solve problem (LP)2 by the dual simplex method, by taking the constraint (6.17) 
directly (as it is already in the canonical form and so will be subsequent cut constraints 
as well) in the last tableau of (LP); to get the following dual simplex tableaus 






The above tableau gives the optimal solution of (LP)2 which still does not meet the
integer requirements. Therefore we derive the next cut constraint, by choosing an
appropriate row of the above tableau, and derive the cut constraint
i.e.
-0.5 = s2 - 0.5x2 - 0.5s1,                (6.18)
which we append to (LP)2 to get the next LPP, namely (LP)3.


Third iteration 


The new LPP, (LP)3, is solved by the dual simplex method by taking the cut constraint
(6.18) directly into the last tableau of (LP)2. This gives the following dual simplex
tableaus





Now the optimal solution of (LP)3, as obtained above, meets all the integer requirements
so we stop and declare (x1* = 3, x2* = 1) as an optimal solution of the given AILP
(6.15) with the objective function value z* = 17.

As our original AILP involves only two variables x1 and x2 we can plot the feasible
region as well as all cut constraints graphically and visualize that the kth cut constraint
certainly deletes a part of the feasible region of (LP)k but does not delete any
integer feasible point of the original AILP (6.15).
The polytope ODAF is the feasible region of the associated LPP, (LP)1,
and its optimal solution is the corner point A : (3.25, 1.25) as obtained by the simplex
method. As it does not meet the integer requirements, we derived the first
cut constraint 0.25x3 + 0.5x4 ≥ 0.75. The first thing we
where (x, v), (A1, A2) and (c1, c2) are appropriate partitions of the decision vector, the
matrix A, and the cost vector c.

Analogous to the development in the previous section, we shall have
the canonical representation of problem (6.19) for a feasible basis matrix B. Let
us recall equation (6.9) of Section 6.3 in this regard. The first thing we note is that
in (6.9) we can not take i = 0 here, as it is a mixed integer case, so the objective function
is not guaranteed to be an integer. Further the index set R of nonbasic variables is
partitioned into R1 and R2, i.e. R = R1 ∪ R2, such that R1 consists of those indices j for
which xj is constrained to be an integer, and R2 = R ∼ R1 is the index set of remaining
nonbasic variables. Therefore the required canonical representation of (6.19) is


$$x_{B_i} = y_{i0} - \sum_{j \in R_1} y_{ij} x_j - \sum_{j \in R_2} y_{ij} v_j \quad (i = 1, \ldots, m). \qquad (6.20)$$

Now separating various terms of (6.20) into integers and fractional parts, we obtain

$$\sum_{j \in R_1} f_{ij} x_j + \sum_{j \in R_2} y_{ij} v_j - f_{i0} = [y_{i0}] - \sum_{j \in R_1} [y_{ij}] x_j - x_{B_i}, \qquad (6.21)$$

for i = 1, 2, ..., m.

But for integer feasible points of problem (6.19), i.e. feasible points (x, v) for which
x is an integer, the R.H.S. of (6.21) is an integer. This implies that the L.H.S. of (6.21)
should also be an integer. Therefore, in particular, either

$$\sum_{j \in R_1} f_{ij} x_j + \sum_{j \in R_2} y_{ij} v_j - f_{i0} \ge 0 \qquad (6.22)$$

or

$$\sum_{j \in R_1} f_{ij} x_j + \sum_{j \in R_2} y_{ij} v_j - f_{i0} \le -1, \qquad (6.23)$$

for each i = 1, 2, ..., m.

We now define following two sets for each i = 1, ..., m

$$R_2^{+} = \{ j \in R_2 : y_{ij} > 0 \}$$

and

$$R_2^{-} = \{ j \in R_2 : y_{ij} < 0 \}.$$

If (6.22) holds then so does

$$\sum_{j \in R_1} f_{ij} x_j + \sum_{j \in R_2^{+}} y_{ij} v_j \ge f_{i0}, \qquad (6.24)$$

and if (6.23) holds so does

$$\sum_{j \in R_2^{-}} y_{ij} v_j \le (-1 + f_{i0}), \qquad (6.25)$$

because $x_j \ge 0$ and $f_{ij} \ge 0$ for all $j \in R_1$, and $v_j \ge 0$ for all $j \in R_2$.
If we multiply (6.25) by $f_{i0}/(-1 + f_{i0})$ (which is less than zero as $0 < f_{i0} < 1$), we get

$$\frac{f_{i0}}{-1 + f_{i0}} \sum_{j \in R_2^{-}} y_{ij} v_j \ge f_{i0}. \qquad (6.26)$$

As L.H.S. of (6.24) and (6.26) are both non negative and at least one of these must hold
(because exactly one of (6.22) and (6.23) holds), the sum of the two left hand sides is
at least $f_{i0}$, i.e. we have

$$f_{i0} - \sum_{j \in R_1} f_{ij} x_j - \sum_{j \in R_2^{+}} y_{ij} v_j + \sum_{j \in R_2^{-}} \frac{f_{i0}\, y_{ij}}{1 - f_{i0}} v_j \le 0. \qquad (6.27)$$

The constraint (6.27) is called the Gomory cut constraint for MILP. It is certainly a
cut constraint because it is satisfied by every feasible point (x, v), x integer, of the given
problem (6.19) and it is not satisfied by the current optimal solution of the associated
LPP, (LP)1. Also, if the problem is all integer then $R_1 = R$ and $R_2 = \emptyset$ and so $R_2^{+} =
R_2^{-} = \emptyset$. In this situation the cut constraint (6.27) reduces to the usual cut constraint

$$f_{i0} - \sum_{j \in R} f_{ij} x_j \le 0, \text{ as expected.}$$
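The mixed integer cut can be assembled mechanically from one tableau row. The following sketch returns the cut in the equivalent "≥" form, i.e. coefficients $g_j$ and a right-hand side $f_{i0}$ with $\sum_j g_j \cdot (\text{nonbasic variable } j) \ge f_{i0}$, where $g_j = f_{ij}$ for integer-constrained nonbasic variables, $g_j = y_{ij}$ when $y_{ij} > 0$, and $g_j = f_{i0}\, y_{ij}/(f_{i0} - 1)$ when $y_{ij} < 0$. The function name and the sample row are assumptions; the row is the x1 row of the optimal tableau of the associated LPP of Example 6.4.1 as computed by hand for this illustration.

```python
from fractions import Fraction
from math import floor


def mixed_gomory_cut(y_i0, y_row, integer_nonbasic):
    """Return (coeffs, rhs) of the cut  sum_j coeffs[j] * var_j >= rhs.

    y_row maps a nonbasic index j to y_ij; integer_nonbasic is the index set R1.
    """
    f_i0 = y_i0 - floor(y_i0)
    if f_i0 == 0:
        raise ValueError("y_i0 is an integer; this row does not generate a cut")
    coeffs = {}
    for j, y in y_row.items():
        if j in integer_nonbasic:            # j in R1: use the fractional part
            coeffs[j] = y - floor(y)
        elif y > 0:                          # j in R2 with y_ij > 0
            coeffs[j] = y
        else:                                # j in R2 with y_ij <= 0
            coeffs[j] = f_i0 * y / (f_i0 - 1)
    return coeffs, f_i0


# Assumed row: x1 = 11/4 - (-1/2) x3 - (1/4) x4, with x3 and x4 continuous.
row = {3: Fraction(-1, 2), 4: Fraction(1, 4)}
mixed_coeffs, mixed_rhs = mixed_gomory_cut(Fraction(11, 4), row, integer_nonbasic=set())

# If x3 and x4 were integer constrained, the cut falls back to fractional parts only.
pure_coeffs, pure_rhs = mixed_gomory_cut(Fraction(11, 4), row, integer_nonbasic={3, 4})

print(mixed_coeffs, mixed_rhs)
print(pure_coeffs, pure_rhs)
```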


The Gomory’s cutting plane method for solving MILP remains exactly same as that
of AILP except for the following

(i) for deriving the cut constraint, equation (6.27) is used and not equation (6.13),
because (6.27) is the Gomory’s cut constraint for MILP, whereas (6.13) is for AILP.

(ii) for generating the cut constraint (6.27), i = 0 is never chosen, because i = 0 refers to
the objective function which is not guaranteed to be integer for the mixed integer
case even if the components of the cost vector c are integers.

(iii) we choose only that i for which the variable $x_{B_i}$ is constrained
to be integer but is currently having the fractional value. Amongst these, we choose
that i for which $f_{i0}$ is the maximum.
Example 6.4.1. Use Gomory’s cutting plane method to solve the following mixed integer
LPP

Max z = 2x1 + x2
subject to
x1 + x2 ≤ 5
6x1 + 2x2 ≤ 21
x1, x2 ≥ 0
x1 integer.

Solution The given problem is

Max z = 2x1 + x2 + 0x3 + 0x4
subject to
x1 + x2 + x3 = 5
6x1 + 2x2 + x4 = 21
x1, x2, x3, x4 ≥ 0
x1 integer.


We shall solve the above problem by using the Gomory’s cutting plane method for MILP. 
For this we consider the associated LPP, (LP); and solve the same by the simplex method 
to get the following optimal simplex tableau 





1/4 
3/2  —1/4 





Now the optimal solution of (LP)1 does not meet the integer requirements (as x1 is
constrained to be integer but is having a fractional value) and therefore it is not optimal
to the given integer programming problem. So we have to derive the cut constraint
(6.27) by choosing an appropriate value of i. Here i = 1 (i.e. through the variable
x1) is the only possibility as only x1 is constrained to be integer. Therefore we identify
$R_1$, $R_2$, $R_2^{+}$ etc. to get

$$R_1 = \emptyset, \qquad R_2 = R = \{3, 4\}.$$

Here $R_1 = \emptyset$ because none of the nonbasic variables is constrained to be an integer.
Next $R_2 = R$ because $R_2 = R \sim R_1 = R \sim \emptyset = R$. Also, by definition, $R_2^{+}$ is the
index set of those nonbasic variables which are in $R_2$ and for which $y_{1j} > 0$. Here the only such
variable is x4 and so $R_2^{-}$ is the index set of x3, i.e. $R_2^{+} = \{4\}$ and $R_2^{-} = \{3\}$.
Plugging these values in (6.27) we get the cut constraint

$$\frac{3}{4} - \frac{1}{4} x_4 + \frac{(3/4)(-1/2)}{(1 - 3/4)} x_3 \le 0$$

i.e.

$$-\frac{3}{4} = s_1 - \frac{1}{4} x_4 - \frac{3}{2} x_3.$$


We next append the above cut constraint in (LP); to get the new problem (LP)2 which 
we solve by employing the dual simplex method. This gives the following tableaus 





XI X2 X3 X4 S1 








1/4 
-1/4 0 



















ve =e i 
w 3/2 —1/2 1 pi 
YOT ak i 
; ji i; 
So the optimal solution of (LP)2 is (x1 = 3, x2 = 3/2, x3 = 1/2, x4 = 0), which meets
the integer requirements as x1 takes the integer value, namely x1 = 3. So we stop and
get an optimal solution of the given MILP as (x1* = 3, x2* = 3/2) with the optimal value
z* = 30/4.

Now, similar to Example 6.3.2, here again we can have a graphical visualization of the
feasible region and cut constraints etc. to verify that the cut constraint (1/4)x4 + (3/2)x3 ≥
3/4 certainly deletes a part of the feasible region of (LP)1 but does not delete any integer
feasible point, i.e. any feasible point (x1, x2) for which x1 is an integer. We refer to Fig.
6.4 in this regard. In this figure, the polytope OABC is the feasible region of the associated
LPP, (LP)1, and the bold lines consist of those points in this region for which the first
coordinate x1 is an integer.
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wag gum : region deleted by the cut constraint. 
3 It does not have any point (x1, x2) 
with xı as integer 
al 

6x1 + 2X2 = 5 a 
; is í 
\ a 
y cut constraint 3x1 + 2x2 < 12 E 
i ent 
J? cn 
j cal 
Z CO) 
of 
S D:(3,3/2) : optimal solution of St 
(LP), and also of the given MILP op 
re 
sti 
3 =(11/4, 9/4): optimal solution of z 
LP) | a 





0 l 2 oe 4 
À X tx = 5 


! 
Fig. 6.4. P 


out to be (x1 = 3, x2 = 3/2) attained at the corner 


| oint D. Fu i 
DBE does not include any point (xi; š rther, the deleted region 


x2) for which x, is an integer. 
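The arithmetic of the cut can be checked with a few lines of Python. This is only a sketch: it assumes that (6.27), for a source row whose nonbasic variables are all continuous, takes the usual Gomory mixed-integer form (coefficients $a_j \ge 0$ are kept, coefficients $a_j < 0$ are multiplied by $f/(f-1)$, and the right-hand side is the fractional part $f$). The function name `mixed_cut` is hypothetical.

```python
from fractions import Fraction as F

def mixed_cut(row, rhs):
    """Return (coeffs, f) for the cut  sum_j coeffs[j] * x_j >= f.

    row maps each nonbasic (continuous) variable index to its coefficient
    in the source row; rhs is the fractional value of the basic variable.
    Assumed form of the cut: standard Gomory mixed-integer cut.
    """
    f = rhs - (rhs.numerator // rhs.denominator)   # fractional part of rhs
    coeffs = {}
    for j, a in row.items():
        if a >= 0:
            coeffs[j] = a                    # positive coefficient is kept
        else:
            coeffs[j] = f / (f - 1) * a      # negative one is scaled by f/(f-1)
    return coeffs, f

# Source row of x1:  x1 - (1/2) x3 + (1/4) x4 = 11/4
coeffs, f = mixed_cut({3: F(-1, 2), 4: F(1, 4)}, F(11, 4))
print(coeffs, f)

def cut_holds(x3, x4):
    """True when the point satisfies the generated cut."""
    return coeffs[3] * x3 + coeffs[4] * x4 >= f
```

With these values the LP optimum B (where $x_3 = x_4 = 0$) violates the cut, while the point D (where $x_3 = 1/2$, $x_4 = 0$) satisfies it with equality.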


6.5 Branch and Bound Method 


Branch and bound is an efficient enumerative technique for solving ILPs. Let the given problem be

Max $z = c^T x$
subject to
$Ax = b$
$x \ge 0$
$x_j$ integer for $j \in J_1 \subseteq J = \{1, 2, \ldots, n\}$. (6.28)





As the name suggests, each iteration consists of branching and bounding. The branching is done from the current node, a node being identified with an LPP; the initial node is the associated LPP of the given problem (6.28). Certain appropriate bounds are calculated so as to decide whether further branching is needed or not. Though it is an enumerative technique, it is efficient in the sense that exhaustive enumeration is seldom needed. Most of the time it is only a partial enumeration, as a node which cannot further improve the currently available solution is discarded and therefore the corresponding region is no longer required for enumeration. The following are the details of the branch and bound method.
Step 1 Solve the associated LPP, called $(LP)_1$, and let $z_1$ be its optimal value. If the optimal solution of $(LP)_1$ has integer components for $j \in J_1$, i.e. it meets the integer requirements of the given ILP, stop; otherwise go to Step 2. Thus in Step 1, either we stop and get an optimal solution of the given ILP or we have an upper bound $z_1$ for the optimal objective function value of the ILP.
Step 2 Select a variable $x_j$ which is constrained to be an integer but currently has a fractional value $\beta_j$ in the optimal solution of $(LP)_1$, and construct the following two LPs


$(LP)_2$: Max $z = c^T x$ subject to $Ax = b$, $x \ge 0$, $x_j \le [\beta_j]$.
$(LP)_3$: Max $z = c^T x$ subject to $Ax = b$, $x \ge 0$, $x_j \ge \langle\beta_j\rangle$.

Here $[\beta_j]$ is the greatest integer less than or equal to $\beta_j$ and $\langle\beta_j\rangle$ is the smallest integer greater than or equal to $\beta_j$. Obviously $[\beta_j]$ is the nearest integer to the left of $\beta_j$ and $\langle\beta_j\rangle$ is the nearest integer to the right of $\beta_j$. For example, if $\beta_j = 1.6$ then $[\beta_j] = 1$ and $\langle\beta_j\rangle = 2$. Sometimes $[\beta_j]$ and $\langle\beta_j\rangle$ are also called, respectively, the floor and ceiling of $\beta_j$.

If we decide to identify problem $(LP)_k$ with the $k$th node, then the above process is to branch from node 1 to get two new nodes, node 2 and node 3. The two new constraints are mutually exclusive, so that $(LP)_2$ and $(LP)_3$ have no feasible point in common. The optimal solution of the given ILP lies either in the feasible region of $(LP)_2$ or in that of $(LP)_3$.
Example 6.5.1. Use the branch and bound method to solve the following ILP

Max $z = 7x_1 + 9x_2$
subject to
$-x_1 + 3x_2 \le 6$
$7x_1 + x_2 \le 35$
$x_1 \le 7$
$x_2 \le 7$
$x_1, x_2 \ge 0$
$x_1$ and $x_2$ integer.


Solution The first step is to solve the associated LPP, $(LP)_1$. The optimal solution of $(LP)_1$ is $(x_1 = 9/2,\ x_2 = 7/2)$ and the optimal value is $z_1 = 63$. As the solution of $(LP)_1$ does not meet the integer requirements, it is not optimal for the given ILP. However, $z_1 = 63$ is an upper bound for the optimal objective function value of the given ILP. Problem $(LP)_1$ is depicted as node 1 in Fig 6.5.

The next step of the branch and bound method is to branch from node 1 via a variable $x_i$ which is constrained to be an integer but currently has a fractional value. In our example, $x_1$ and $x_2$ are both constrained to be integers and both have fractional values, so we may branch from node 1 via $x_1$ or $x_2$. In practice, we choose that $i$ for which the fractional part of $x_i$ is largest, with the hope that this way we may get a deeper cut. Here $x_1$ and $x_2$ have equal fractional parts, so we can branch from either $x_1$ or $x_2$. If we decide to branch from $x_1$, then for $\beta_1 = 9/2$ we have $[\beta_1] = 4$ and $\langle\beta_1\rangle = 5$, and therefore we get two new LPs, $(LP)_2$ and $(LP)_3$, as


$(LP)_2$: same as $(LP)_1$ but with one additional constraint $x_1 \le 4$.
$(LP)_3$: same as $(LP)_1$ but with one additional constraint $x_1 \ge 5$.


These are identified as node 2 and node 3. Solving $(LP)_2$ we get $(x_1 = 4,\ x_2 = 10/3)$ and $z_2 = 58$. Similarly, solving $(LP)_3$ we get $(x_1 = 5,\ x_2 = 0)$ and $z_3 = 35$. As the solution of $(LP)_3$ meets the integer requirements, the value $z_3 = 35$ becomes a lower bound for the optimal objective function value of the given ILP.

At this stage we check the objective function values of the other LPs. In our example there is only one node to be checked, i.e. node 2. As the objective function value of node 2 is more than this lower bound $z_3 = 35$, this node is not fathomed and it is considered for further branching. Had the value of $(LP)_2$ been less than the lower bound $z_3 = 35$, it would have been fathomed and $(x_1 = 5,\ x_2 = 0)$ would have been taken as an optimal solution of the given ILP.

Branching from node 2 via $x_2$ (with $\beta_2 = 10/3$) gives one node with the additional constraint $x_2 \le 3$, whose solution $(x_1 = 4,\ x_2 = 3)$ is integer with objective value 55, and one node with the additional constraint $x_2 \ge 4$, which is infeasible.

Now all nodes have been fathomed and so we choose the solution with the maximum objective function value. This gives $(x_1 = 4,\ x_2 = 3)$ as an optimal solution of the given ILP and the optimal objective function value $z^* = 55$. We depict the above details in Fig. 6.6.

Fig. 6.6. Branch and bound tree: node 1 with $x_1 = 9/2$, $x_2 = 7/2$, $z_1 = 63$ (upper bound); node 3 with $x_1 = 5$, $x_2 = 0$, $z_3 = 35$ (lower bound); one infeasible node.
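The steps of Example 6.5.1 can be replayed with a small program. The sketch below is not the book's code: the LP at each node is solved by enumerating the vertices of the two-variable feasible region (adequate only for tiny bounded problems), and the objective $7x_1 + 9x_2$ is an assumption inferred from the values $z_1 = 63$ at $(9/2, 7/2)$ and $z_3 = 35$ at $(5, 0)$. The function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations
from math import floor, ceil

def solve_lp(c, cons):
    """Maximise c.x over {x in R^2 : a.x <= b for (a, b) in cons}
    by checking every intersection of two constraint lines."""
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue                                   # parallel lines
        x = (F(b1 * a2[1] - a1[1] * b2, det),          # Cramer's rule
             F(a1[0] * b2 - b1 * a2[0], det))
        if all(a[0] * x[0] + a[1] * x[1] <= b for a, b in cons):
            z = c[0] * x[0] + c[1] * x[1]
            if best is None or z > best[0]:
                best = (z, x)
    return best                                        # None: infeasible

def branch_and_bound(c, cons):
    incumbent = None                  # best integer solution (lower bound)
    stack = [cons]
    while stack:
        node = stack.pop()
        lp = solve_lp(c, node)
        if lp is None:
            continue                  # fathomed: infeasible
        z, x = lp
        if incumbent is not None and z <= incumbent[0]:
            continue                  # fathomed: cannot beat the lower bound
        frac = [j for j in range(2) if x[j].denominator != 1]
        if not frac:
            incumbent = (z, x)        # fathomed: integer solution
            continue
        j = frac[0]                   # branch on the first fractional variable
        e = (1, 0) if j == 0 else (0, 1)
        stack.append(node + [(e, floor(x[j]))])                # x_j <= floor
        stack.append(node + [((-e[0], -e[1]), -ceil(x[j]))])   # x_j >= ceil
    return incumbent

c = (7, 9)
cons = [((-1, 3), 6), ((7, 1), 35), ((1, 0), 7), ((0, 1), 7),
        ((-1, 0), 0), ((0, -1), 0)]
print(solve_lp(c, cons))
print(branch_and_bound(c, cons))
```

The incumbent plays the role of the lower bound in the text: a node is discarded as soon as its LP value cannot exceed it.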


Example 6.5.2. Use the branch and bound method to solve the following integer LPP
Max $z = 3x_1 + 4x_2$
subject to

Z% h NO Xp es 2 
$3x_1 - 2x_2 \le 9$
$x_1, x_2 \ge 0$
$x_1$ and $x_2$ integer.


Solution Following the branch and bound method we obtain Fig. 6.7. Now all the nodes have been fathomed and we get an optimal solution of the given ILP. If we plot the feasible region, we can see that the optimal solution of the ILP is in the branch $x_2 \ge 2$ (Fig. 6.8).

Fig. 6.7. Branch and bound tree for Example 6.5.2. Legible node labels: $z = 15$; $x_2 = 1$ with $z_4 = 13$ (lower bound); $z_6 = 96/7$; $z_8 = 14$; one infeasible node.

Fig. 6.8.


values, e.g. the load carrying capacity of trucks, which may vary among certain standard capacities. Let it be known that the variable $x_j$ can take only one of $k$ specified values, namely $\alpha_{1j}, \alpha_{2j}, \ldots$ and $\alpha_{kj}$. Then this physical constraint can mathematically be modeled as

$x_j = \delta_1\alpha_{1j} + \delta_2\alpha_{2j} + \ldots + \delta_k\alpha_{kj}$
$\delta_1 + \delta_2 + \ldots + \delta_k = 1$
$\delta_i \ge 0$ and integer $(i = 1, 2, \ldots, k)$.

As $\sum_{i=1}^{k}\delta_i = 1$ and $\delta_i \ge 0$ and integer for $i = 1, 2, \ldots, k$, only one of the $\delta_i$ will take the value 1 and the others have to be zero. This forces the variable $x_j$ to take only one of the designated values $\alpha_{1j}, \alpha_{2j}, \ldots$ and $\alpha_{kj}$.
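A quick enumeration illustrates why this encoding works. The capacities in `alphas` are hypothetical numbers chosen only for illustration; the check lists every 0-1 vector $\delta$ with $\sum_i \delta_i = 1$ and records the resulting value of $x_j$.

```python
from itertools import product

alphas = [5, 10, 20]          # hypothetical standard capacities alpha_1j..alpha_kj

# All delta in {0,1}^k with delta_1 + ... + delta_k = 1, and the x_j they give.
reachable = sorted(
    sum(d * a for d, a in zip(delta, alphas))
    for delta in product((0, 1), repeat=len(alphas))
    if sum(delta) == 1
)
print(reachable)
```

The reachable values coincide with the designated values, as the argument above claims.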





(b) Problems with Either Or Constraint
Suppose we are given the following optimization problem

Max $z = x_1 + x_2$
subject to
($3x_1 + x_2 \le 4$
or
$2x_1 + 3x_2 \le 5$) (6.29)
$x_1, x_2 \ge 0$.







The problem (6.29) cannot [...] it should have been obvious [...]

Fig. 6.9.


We now assume that an upper bound of the functions $(3x_1 + x_2 - 4)$ and $(2x_1 + 3x_2 - 5)$ over the region $(x_1 \ge 0,\ x_2 \ge 0)$ is known. If no such bound is known then we can take a very large positive number $M$ such that for all $(x_1 \ge 0,\ x_2 \ge 0)$ of interest we have $(3x_1 + x_2 - 4) \le M$ and $(2x_1 + 3x_2 - 5) \le M$. To be specific, let us choose $M = 100$ for our example and construct the following integer linear programming problem

Max $z = x_1 + x_2$
subject to
$3x_1 + x_2 - 4 - 100\delta \le 0$
$2x_1 + 3x_2 - 5 - 100(1 - \delta) \le 0$
$\delta \le 1$
$x_1, x_2, \delta \ge 0$
$\delta$ integer. (6.30)


Problem (6.30) is equivalent to problem (6.29) because, as per the constraints of (6.30), $\delta$ can take only two values, namely 'zero' and 'one'. For $\delta = 1$, the first constraint is satisfied automatically since $(3x_1 + x_2 - 4) \le 100$ (upper bound), and the second constraint reduces to $2x_1 + 3x_2 \le 5$. For $\delta = 0$, the roles are reversed and therefore we get the same problem as problem (6.29).
In this context it is obvious that if the constraints are given in '≥' form rather than '≤' form, then instead of an upper bound we need a lower bound of the constraint functions, because if $M$ is an upper bound of $g(x)$, then $-M$ is a lower bound of $-g(x)$ over the same region.
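The equivalence of (6.29) and (6.30) can be spot-checked numerically. The sketch below compares the two feasibility tests on a grid of points with $0 \le x_1, x_2 \le 6$ (a region where both constraint functions stay well below $M = 100$); the grid and the function names are choices made here for illustration, not part of the text.

```python
from fractions import Fraction as F

M = 100

def feasible_either_or(x1, x2):
    """Feasibility for (6.29): at least one of the two constraints holds."""
    return x1 >= 0 and x2 >= 0 and (3*x1 + x2 <= 4 or 2*x1 + 3*x2 <= 5)

def feasible_big_m(x1, x2):
    """Feasibility for (6.30): some delta in {0, 1} satisfies both rows."""
    return x1 >= 0 and x2 >= 0 and any(
        3*x1 + x2 - 4 - M*d <= 0 and 2*x1 + 3*x2 - 5 - M*(1 - d) <= 0
        for d in (0, 1))

grid = [F(i, 4) for i in range(25)]          # 0, 1/4, ..., 6
agree = all(feasible_either_or(a, b) == feasible_big_m(a, b)
            for a in grid for b in grid)
print(agree)
```

A grid check is of course no proof; it only confirms that the two descriptions agree on the sampled points.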


(c) Specified Number 'p' out of 'm' Given Constraints
Let the given constraints be $g_i(x) \le 0$ $(i = 1, 2, \ldots, m)$ and we wish that $p$ (any $p$, and not some specified $p$ constraints) of these $m$ constraints should hold. Let $M_i$ be an upper bound for the constraint function $g_i$, i.e. $g_i(x) \le M_i$ for $x \in R^n$. Then the set of constraints

$g_1(x) - M_1\delta_1 \le 0$
$g_2(x) - M_2\delta_2 \le 0$
$\vdots$
$g_m(x) - M_m\delta_m \le 0$
$\sum_{i=1}^{m}\delta_i = m - p$
$0 \le \delta_i \le 1$
and $\delta_i$ integer $(i = 1, 2, \ldots, m)$

model the physical situation that $p$ out of $m$ given constraints hold.


6.7 Zero-One Implicit Enumeration Algorithm 


In the last section, we have seen a few examples of 0-1 ILPs. Though these problems could also be solved by any general ILP method, such as the cutting plane or the branch and bound methods, we have certain special algorithms for 0-1 ILPs, one of which is the additive algorithm. In principle, the additive algorithm is a special case of the general branch and bound method.

To apply the additive algorithm, we assume that the given 0-1 ILP satisfies the following requirements

(i) The objective function is given in the minimization form, i.e. min $c^T x$, with all coefficients $c_j \ge 0$.
(ii) All the constraints are given in '≤' form. This necessarily means that some $b_i$ may be less than or equal to zero. These constraints are then converted to equations by introducing slack variables $s_i$, such that $s_i \ge 0$ for all $i$.
We observe that (i) Max $3x_1 - 5x_2$ is equivalent to $-$Min $(-3x_1 + 5x_2)$, (ii) $x_1 + x_2 = 5$ is equivalent to $x_1 + x_2 \le 5$ and $-x_1 - x_2 \le -5$, and (iii) the remaining constraint is equivalent to $4x_1 - 6x_2 \le -4$. Therefore problem (6.31) is equivalent to

Min $w = -3x_1 + 5x_2$
subject to
$x_1 + x_2 + s_1 = 5$
$-x_1 - x_2 + s_2 = -5$
$4x_1 - 6x_2 + s_3 = -4$ (6.32)
$x_1, x_2$ both are 0 or 1
$s_1 \ge 0,\ s_2 \ge 0,\ s_3 \ge 0$.

The only problem now is that the coefficient of $x_1$ in the objective function is $-3$, which is less than zero. But if we substitute $x_1 = 1 - x_1'$ in the objective function and in the constraints of (6.32), and adjust the R.H.S. accordingly, we get the problem

Min $w' = 3x_1' + 5x_2$
subject to
$-x_1' + x_2 + s_1 = 4$
$x_1' - x_2 + s_2 = -4$
$-4x_1' - 6x_2 + s_3 = -8$
$x_1', x_2$ both are 0 or 1
$s_1 \ge 0,\ s_2 \ge 0,\ s_3 \ge 0$

where $w' = w + 3$, and minimizing $w$ is the same as minimizing $w'$. This problem is in the standard 0-1 ILP form.

The branching strategy of the additive algorithm is again based on the use of a branching variable $x_j$. Here the two branches correspond to $x_j = 0$ and $x_j = 1$, as $x_j$ is a binary variable. The bounding strategy is very similar to that of the branch and bound method. Since the problem is in minimization form, an improved integer solution provides an upper bound on the minimum value of the objective function.
In any subproblem, the fathoming can occur in any one of the following three ways

(i) The subproblem cannot lead to a feasible solution.
(ii) The subproblem cannot yield a better upper bound.
(iii) The subproblem leads to a feasible integer solution.

We now illustrate the working of the additive algorithm.
Example 6.7.2. Solve the following 0-1 ILP by the additive algorithm

Max $w = 3y_1 + 2y_2 - 5y_3 - 2y_4 + 3y_5$
subject to
$y_1 + y_2 + y_3 + 2y_4 + y_5 \le 4$
$7y_1 + 3y_3 - 4y_4 + 3y_5 \le 8$
$11y_1 - 6y_2 + 3y_4 - 3y_5 \ge 3$
$y_1, y_2, y_3, y_4, y_5$ all 0 or 1.

Solution The standard 0-1 ILP form of the above problem is

Min $z = 3x_1 + 2x_2 + 5x_3 + 2x_4 + 3x_5$
subject to
$-x_1 - x_2 + x_3 + 2x_4 - x_5 + s_1 = 1$
$-7x_1 + 3x_3 - 4x_4 - 3x_5 + s_2 = -2$
$11x_1 - 6x_2 - 3x_4 - 3x_5 + s_3 = -1$ (6.33)
$x_1, x_2, x_3, x_4, x_5$ all 0 or 1
$s_1, s_2, s_3$ all $\ge 0$,

where $y_1 = 1 - x_1$, $y_2 = 1 - x_2$, $y_3 = x_3$, $y_4 = x_4$, $y_5 = 1 - x_5$ and $w = -(z - 8)$.
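The passage from the given problem to the standard form can be mechanised. The following sketch (the function name `to_standard_form` is hypothetical) starts from the minimization form min $c^T y$, $Ay \le b$ and complements every variable whose cost is negative; the data are those of Example 6.7.2, with the objective written as min $(-w)$ and the '≥' constraint multiplied by $-1$.

```python
def to_standard_form(c, A, b):
    """Complement each variable with negative cost (y_j = 1 - x_j).

    Input : min c.y  subject to  A y <= b,  y binary.
    Output: (c2, A2, b2, offset, flipped) with c2 >= 0 such that
            c.y = c2.x + offset  and  A y <= b  <=>  A2 x <= b2.
    """
    n = len(c)
    flipped = [cj < 0 for cj in c]
    c2 = [abs(cj) for cj in c]
    offset = sum(cj for cj in c if cj < 0)
    # Substituting y_j = 1 - x_j changes the sign of column j ...
    A2 = [[-a if flipped[j] else a for j, a in enumerate(row)] for row in A]
    # ... and moves the constant a_ij to the right-hand side.
    b2 = [bi - sum(row[j] for j in range(n) if flipped[j])
          for row, bi in zip(A, b)]
    return c2, A2, b2, offset, flipped

c = [-3, -2, 5, 2, -3]                      # min (-w)
A = [[1, 1, 1, 2, 1],
     [7, 0, 3, -4, 3],
     [-11, 6, 0, -3, 3]]                    # third row: ">= 3" times -1
b = [4, 8, -3]
c2, A2, b2, offset, flipped = to_standard_form(c, A, b)
print(c2, A2, b2, offset)
```

Here `offset` is the constant that relates the two objectives, so $-w = z + \text{offset}$.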

Now we explain the method. As in problem (6.33) we seek the minimization of a linear objective function with all coefficients non-negative, a logical starting solution should consist of all-zero binary variables. In this case the slacks will act as basic variables whose values will be the $b_i$'s given on the R.H.S. This gives the following tableau

Basic                  | x1   x2   x3   x4   x5 | s1  s2  s3 | solution
s1                     | -1   -1    1    2   -1 |  1   0   0 |   1
s2                     | -7    0    3   -4   -3 |  0   1   0 |  -2
s3                     | 11   -6    0   -3   -3 |  0   0   1 |  -1
Objective coefficients |  3    2    5    2    3 |  0   0   0 |

Now given an initial all-zero binary solution, the associated slack solution is $(s_1, s_2, s_3) = (1, -2, -1)$, $z = 0$. In case all the slacks were non-negative, we would have stopped and declared the all-zero binary solution as optimal. But as some slacks are negative here, we need to upgrade one or more binary variables to the value 'one' so as to achieve feasibility or to declare that the problem is infeasible. Usually only one of the binary variables is elevated from 'zero' to 'one' at a time, and the choice of this branching variable is guided by the infeasibility of the slacks.

In our example, $x_3$ is not considered as a branching variable, because its coefficient in the second and third constraints is non-negative; therefore setting $x_3 = 1$ cannot reduce the infeasibility of $s_2$ and $s_3$. For the other remaining variables, namely $x_1, x_2, x_4$ and $x_5$, there is at least one negative coefficient in constraint 2 and constraint 3, so they are the possible candidates for the elevation to the value 'one'.

Amongst the possible candidates for branching (e.g. $x_1, x_2, x_4$ and $x_5$ in the given example), we choose the one using a measure of slack infeasibility. This measure is based on the assumption that a zero value variable $x_j$ will be elevated to the value 'one' and it is defined as

$I_j = \sum_{\text{all } i} \min(0,\ s_i - a_{ij}),$

where $s_i$ is the current value of the slack and $a_{ij}$ is the coefficient of the variable $x_j$ in the $i$th constraint. It can be shown that $I_j$ has an equivalent expression, namely

$I_j$ = sum of the negative slacks resulting from elevating $x_j$ to the value 'one'
$= \sum_{\text{all } i} (\text{negative } s_i \text{ value given } x_j = 1).$ (6.34)
“a all i 


















































In our example, when we set $x_1 = 1$ we get $s_1 = 1 - (-1) = 2$, $s_2 = -2 - (-7) = 5$ and $s_3 = -1 - 11 = -12$. Thus $I_1 = -12$. Similarly $I_2 = -2$, $I_4 = -1$ and $I_5 = 0$. We do not compute $I_3$ as $x_3$ is excluded as a branching variable. Because $I_5$ yields the smallest measure of slack infeasibility, $x_5$ is selected as the branching variable. In other words, we compute the largest of the $I_j$'s and then decide the branching variable. So from node 0 ($z = 0$) we branch to two new nodes, node 1 and node 2, via the branching variable $x_5$, by taking $x_5 = 1$ on one branch and $x_5 = 0$ on the other branch. At node 1 ($x_5 = 1$), we have slacks $(s_1, s_2, s_3) = (2, 1, 2)$ and $z = 3$. Thus node 1 is fathomed and $z = 3$ is the current upper bound for the optimal objective function value. At node 2 ($x_5 = 0$), we have slacks $(s_1, s_2, s_3) = (1, -2, -1)$, $z = 0$, which is infeasible. Now the variables $x_1, x_2, x_3$ and $x_4$ are the candidates for the branching variable from node 2. Here we note that though the solutions at node 2 and node 0 are identical, node 2 is different because $x_5$ is no longer a branching candidate. Now for node 2, $x_3$ is not promising as it does not move $s_2$ or $s_3$ toward feasibility. The variables $x_1$ and $x_3$ are also not promising because their objective coefficients (3 and 5) cannot yield an objective function value better than the current upper bound ($z = 3$). For the remaining variables $x_2$ and $x_4$, the measures of slack infeasibility are $I_2 = -2$ and $I_4 = -1$, so $x_4$ is selected as the branching variable at node 2.
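The measure $I_j$ is easy to compute directly from the data of (6.33). The short sketch below reproduces the values used at node 0; the helper name `infeasibility` is hypothetical, and column indices are 0-based in the code.

```python
A = [[-1, -1, 1, 2, -1],
     [-7, 0, 3, -4, -3],
     [11, -6, 0, -3, -3]]
s = [1, -2, -1]                  # slacks at the all-zero solution (node 0)

def infeasibility(j, A, s):
    """I_j = sum over i of min(0, s_i - a_ij), for 0-based column j."""
    return sum(min(0, si - row[j]) for row, si in zip(A, s))

# x3 (column 2) is excluded as a branching candidate.
I = {j + 1: infeasibility(j, A, s) for j in (0, 1, 3, 4)}
branch = max(I, key=I.get)       # variable with the largest I_j
print(I, branch)
```

Because the slacks at node 2 equal those at node 0, the same values of $I_2$ and $I_4$ decide the branching variable there.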


Fig. 6.10. Branching from node 0 ($z = 0$) via $x_5$: one node with $z = 0$ and one node with $z = 3$ (fathomed).


At node 4, we have $x_5 = x_4 = 0$, which gives $(s_1, s_2, s_3) = (1, -2, -1)$, $z = 0$. The variables $x_1$ and $x_3$ are excluded by the upper bound test. In fact, $x_3$ can also be excluded because it cannot reduce slack infeasibility. The remaining variable $x_2$ cannot be excluded by either of these two tests, so $x_2$ is chosen as the branching variable. At node 5, we have $(s_1, s_2, s_3) = (2, -2, 5)$, $z = 2$. At this node $x_1$ and $x_3$ are branching candidates but they are excluded by the two tests discussed above. So node 5 is fathomed. Also node 6 is fathomed because neither $x_1$ nor $x_3$ can produce a better feasible solution. This gives the optimal solution as $x_5 = 1$, other $x_j = 0$, with the optimal value $z = 3$ given at node 1. Now changing to the $y_j$, we get $y_1 = 1$, $y_2 = 1$, $y_3 = 0$, $y_4 = 0$, $y_5 = 0$ with $w = -(z - 8) = -(3 - 8) = 5$.
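Since the problem has only five binary variables, the result can be confirmed by complete enumeration of the $2^5 = 32$ points of (6.33). This is a brute-force check, not the additive algorithm itself.

```python
from itertools import product

c = [3, 2, 5, 2, 3]
A = [[-1, -1, 1, 2, -1],
     [-7, 0, 3, -4, -3],
     [11, -6, 0, -3, -3]]
b = [1, -2, -1]

def is_feasible(x):
    """All three '<=' constraints of (6.33) hold (slacks non-negative)."""
    return all(sum(a * xi for a, xi in zip(row, x)) <= bi
               for row, bi in zip(A, b))

feasible = [x for x in product((0, 1), repeat=5) if is_feasible(x)]
z, x = min((sum(ci * xi for ci, xi in zip(c, x)), x) for x in feasible)

# Back to the original variables: y1 = 1-x1, y2 = 1-x2, y3 = x3, y4 = x4, y5 = 1-x5
y = (1 - x[0], 1 - x[1], x[2], x[3], 1 - x[4])
w = 3*y[0] + 2*y[1] - 5*y[2] - 2*y[3] + 3*y[4]
print(z, x, y, w)
```

The implicit enumeration above reaches the same answer while examining only a handful of nodes.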






[Figure: enumeration tree for Example 6.7.2. Root with $z = 0$; branches $x_5 = 0$ ($z = 0$) and $x_5 = 1$ ($z = 3$, fathomed); a deeper node with $z = 2$ (fathomed).]
6.8 Summary and Additional Notes

• Two most basic approaches, namely the cutting plane approach and the branch and bound approach, for solving ILPs are discussed in this chapter.

• R.E. Gomory developed the (fractional) cutting plane method for AILP in 1958, which we present here in Section 6.3.

• The cutting plane method for MILP was given by R.E. Gomory in 1960, which we discuss here in Section 6.4. The first branch and bound method for solving ILPs was given by A.H. Land and A.G. Doig in 1960 and the same is presented here in Section 6.5.

• The implicit 0-1 enumerative algorithm discussed here in Section 6.7 was given by E. Balas in 1965.

• There are now many other (improved) cutting plane and branch and bound methods available in the literature. For these developments and other related topics we may refer to certain well known texts on ILPs, e.g. Garfinkel and Nemhauser [65], Hu [78], Salkin [138], Zionts [172] and Taha [155].

• There is altogether a different approach for solving ILPs which uses the theory of finite (additive) Abelian groups. The basic work on the group theoretic approach for solving ILPs is due to D.S. Chen, who submitted his thesis with the same title in 1970 to the Department of Industrial Engineering at SUNY (Buffalo). An elementary exposition of the same is available in Chen and Zionts [36], and Zionts [172].

• Integer programming problems have many applications in capital budgeting, sequencing, and scheduling (e.g. airline crew scheduling). Some famous ILPs are the knapsack problem and the traveling salesperson problem. The well known fixed charge problem can be transformed to an ILP. A general separable programming problem can always be approximated by a 0-1 integer linear programming problem. The text by Taha [155] gives details of these applications of ILPs.

• The problems of set covering and matching which occur in the area of graph theory can be modeled as ILPs, e.g. Papadimitriou and Steiglitz [124].





6.9 Exercises 


6.1 Solve the following ILPs graphically

(1) [...]

(2) Min $z = 10x_1 + 9x_2$
subject to
$x_1 \le 8$
$x_2 \le 10$
$5x_1 + 3x_2 \ge 45$
$x_1, x_2 \ge 0$
$x_2$ integer.

(3) Max $z = 4x_1 + 3x_2$
subject to
$3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_1, x_2 \ge 0$
$x_1$ and $x_2$ integers.

(4) Max $z = 4x_1 + 3x_2$
subject to
$3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_1, x_2 \ge 0$
$x_1$ integer.


6.2 Consider the ILP

Max $z = 15x_1 + 32x_2$
subject to
$7x_1 + 16x_2 \le 52$
$3x_1 - 2x_2 \le 9$
$x_1, x_2 \ge 0$
$x_1$ and $x_2$ integers.

1. Solve the above ILP [...]
2. [...] the optimal solution of the associated LPP (also called the relaxed LPP) [...] the optimal solution obtained [...] method and also by the branch and bound method [...]
6.6 Formulate the following into an equivalent ILP

Max $z = 3x_1 + 7x_2 + 5x_3$
subject to
$2x_1 + 5x_2 + 6x_3 \le 35$
$5x_1 + 9x_2 \ge 3$
Ny =/ ion lo Or 25 
Xo > 0 and integer. 


6.7 Formulate the mixed integer programming problem equivalent to the following problem

Max $z = 3x_1 + 4x_2$
subject to
either ($x_1 \le 4$ and $x_2 \ge 5$)
or ($x_1 \ge 5$ and $x_2 \le 2$).

6.8 Formulate the equivalent mixed integer programming problem for the following problem

Max $z = 5x_1 + 6x_2 + 2x_3$

where $(x_1, x_2, x_3)$ satisfies at most 2 of the following 4 constraints

$x_1 + 2x_2 + x_3 \le 5$
$3x_1 - 7x_2 + x_3 \le 9$
$2x_1 + 3x_2 - 7x_3 \le 4$
$-9x_1 + 7x_2 + 8x_3 \le 5$
$x_1, x_2, x_3 \ge 0$.


[...] Formulate the equivalent mixed integer programming problem for the following [...]
Max $z = 4x_1 + 3x_2$
subject to
$|x_1 + x_2| \ge 2$
xl < 2. 


6.11 Consider the following optimal simplex tableau of the associated LPP for a given AILP (in maximization form, with $x_3, x_4$ as slack variables)

             | x1   x2   x3      x4
x1 = 11/2    |  1    0   11/36   -1/36
x2 = 9/2     |  0    1   -1/12   -1/12
z  = 23/4    |  0    0    7/12    1/4

Generate Gomory's cut constraint through $x_1$ and hence solve the given AILP. Also show the actual cut constraints graphically.
7
Convex Optimization and Quadratic Programming


7.1 Introduction 


A common framework of our studies in earlier chapters has been the presence of linearity 
structure on the given optimization problem, which not only gave beautiful mathematical 
results but also helped greatly in its algorithmic development. Therefore it may not be 
wrong to say that the success story of linear programming has its roots in the underlying 
structure of linearity. However, most real world applications lead to optimization
problems which are inherently nonlinear and are therefore devoid of any linearity structure.
Fortunately, this nonlinearity is most often of 'parabola' type, leading to a convexity
structure which can also be exploited to study such nonlinear optimization problems.
Our basic aim in this chapter is to understand convex optimization problems, i.e.
those optimization problems which have the structure of 'convexity'. These problems
are best understood in terms of the convexity/concavity of the objective and constraint
functions.

In the later part of this chapter, we also make an algorithmic study of a special type 
of convex programming problem, namely, the quadratic programming problem. 


7.2 Convex Functions and their Properties


There is a vast literature on convex sets and convex functions. In the very brief introduction which we wish to present here, we restrict ourselves only to those properties of convex functions (and related functions) which are related to the type of finite dimensional optimization problems studied here. Convex sets were introduced in an earlier chapter.

Definition 7.2.1. (Convex function). Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then $f$ is called a convex function on $S$ if for all $x, u \in S$ and for all $0 \le \lambda \le 1$, we have

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u).$ (7.1)

Fig. 7.1.





Let us try to understand the geometrical meaning of the inequality (7.1) by referring to Fig. 7.1.

Here in Fig. 7.1, we have shown the shape of a typical convex function $f : R \to R$. Let $\hat{x} = \lambda x + (1-\lambda)u$ for a particular choice of $\lambda$, $0 \le \lambda \le 1$. Then the coordinates of the point B are $(\hat{x}, f(\hat{x}))$ and the LHS of inequality (7.1) is the height AB. The RHS of inequality (7.1), being the weighted mean of the heights PD and QE with the weights $\lambda$ and $(1-\lambda)$, is the height AC. In inequality (7.1), the LHS being less than or equal to the RHS means AB ≤ AC. This has to be true for all $x, u$ in the domain, and therefore it is possible only when the line segment PQ lies on or above the graph of the function $y = f(x)$ between P and Q. Therefore a function $f$ is convex if for any two points P and Q on the curve (graph of the function $f$), the line segment joining P and Q is always on or above the curve between P and Q but never below the curve.
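Inequality (7.1) can be tested numerically on sample points. The sketch below is a falsification test only: finding no violation on a grid does not prove convexity, but a single violation disproves it. The function name `violates_convexity`, the grid and the tolerance are choices made here for illustration.

```python
import math

def violates_convexity(f, points, steps=10, tol=1e-12):
    """Return a triple (x, u, lam) violating (7.1), or None if none is found."""
    for x in points:
        for u in points:
            for k in range(steps + 1):
                lam = k / steps
                lhs = f(lam * x + (1 - lam) * u)          # height AB
                rhs = lam * f(x) + (1 - lam) * f(u)       # height AC
                if lhs > rhs + tol:
                    return (x, u, lam)
    return None

pts = [i / 4 for i in range(-8, 9)]          # sample points in [-2, 2]
print(violates_convexity(lambda t: t * t, pts))
print(violates_convexity(abs, pts))
print(violates_convexity(math.exp, pts))
print(violates_convexity(math.sin, pts))
```

The first three functions pass the test on this grid, whereas for $\sin x$ a violating chord is found.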
From the above discussion it is simple to observe that the following functions are convex:

(i) [...]
(ii) $f(x) = |x|$, $x \in R$
(iii) [...]
(iv) $f(x) = e^x$, $x \in R$
(v) $f(x) = -\sqrt{1 - x^2}$, $-1 \le x \le 1$.

Definition 7.2.2. (Concave function). Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then $f$ is called a concave function on $S$ if for all $x, u \in S$ and for all $0 \le \lambda \le 1$, we have

$f(\lambda x + (1-\lambda)u) \ge \lambda f(x) + (1-\lambda)f(u).$ (7.2)

If $f : R \to R$, then the above inequality geometrically means that for all points P and Q on the curve (graph of the function $f$), the line segment joining P and Q is always on or below the curve between P and Q but never above the curve.

The shape of a typical concave function is shown in Fig. 7.2.

Fig. 7.2.





From the above discussion, it is obvious that $f$ is a concave function if and only if $-f$ is a convex function.
Therefore we can check that the following are concave functions: [...]
Fig. 7.3.

The examples of convex functions (respectively concave functions) given above are really strictly convex functions (respectively strictly concave functions). A convex function which is not strictly convex has a typical shape as shown in Fig. 7.3.

As an example of a convex function which is not strictly convex, we may take the function $\phi : R \to R$ given by $\phi(x) = \text{Max}(x^2, x)$, which has the shape given in Fig. 7.4 (it is linear on $0 \le x \le 1$).





Fig. 7.4. 


From the above examples, we make note of the following

1. If a function is both convex and concave, then it has to be a linear (affine) function.

2. A function may be neither convex nor concave, e.g. $f(x) = \sin x$, $x \in R$.

3. The domain of a convex function has to be a convex set. This is because in the definition we need to evaluate the function at $\hat{x} = \lambda x + (1-\lambda)u$ for all $x, u \in S$ and for all $0 \le \lambda \le 1$, so $\hat{x}$ has to be in the domain $S$. Obviously the domain of a concave function has also to be convex.

4. A convex/concave function need not be differentiable; e.g. $f(x) = |x|$, $x \in R$ is convex but is not differentiable at $x = 0$.

5. A convex function need not even be continuous, as can be seen from the function shown in Fig. 7.5, which is convex but discontinuous at a boundary point of its domain. However, we can show that a convex function is always continuous in the interior of its domain. Thus if $f$ is convex over a convex set $S \subseteq R^n$, then the points of discontinuity (if any) could only be on the boundary of $S$.

Fig. 7.5.


6. Let $f$ and $g$ be two convex functions defined over a convex set $S \subseteq R^n$. Then (a) $f + g$, (b) $\alpha f$ $(\alpha \ge 0)$, and (c) $h(x) = \text{Max}(f(x), g(x))$ are also convex functions. Thus $-x + 2x^2 + |x|$ is a convex function for $x \in R$. Also $h(x) = \text{Max}(x, x^2)$, $x \in R$, is a convex function (see Fig. 7.4).

7. Results similar to (6) above obviously hold for concave functions as well. Thus if $f$ and $g$ are concave functions defined over a convex set $S \subseteq R^n$, then so are $f + g$, $\alpha f$ $(\alpha \ge 0)$ and $h(x) = \text{Min}(f(x), g(x))$. As an example, the min function $h$ of two concave functions is shown in Fig. 7.6. Also the function $-x - 2x^2 + \ln x$ $(x > 0)$ is a concave function.

Fig. 7.6.


Definition 7.2.3. (Epigraph). Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then the set $E_f \subseteq R^{n+1}$ given by

$E_f = \{(x, \alpha) : x \in S,\ \alpha \in R,\ f(x) \le \alpha\}$

is called the epigraph of $f$.

Definition 7.2.4. (Level Set / $\alpha$-cut). Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Let $\alpha \in R$. Then the set $\Gamma_\alpha \subseteq R^n$ given by

$\Gamma_\alpha = \{x \in S : f(x) \le \alpha\}$

is called the $\alpha$-level set or the $\alpha$-cut of the function $f$.

Definition 7.2.5. (Hypograph). Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then the set $G_f \subseteq R^{n+1}$ given by

$G_f = \{(x, \alpha) : x \in S,\ \alpha \in R,\ f(x) \ge \alpha\}$

is called the hypograph of $f$.

A visualization of these sets for $S \subseteq R$ is given in Fig. 7.7. We now have the following results.
Theorem 7.2.1 Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then $f$ is a convex function on $S$ if and only if its epigraph $E_f$ is a convex set.

Proof. (i) (Necessity) Let $f$ be convex on $S$ and let $(x, \alpha)$ and $(u, \beta) \in E_f$. Then by the convexity of $f$ we have, for $0 \le \lambda \le 1$,

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u) \le \lambda\alpha + (1-\lambda)\beta.$

Therefore $(\lambda x + (1-\lambda)u,\ \lambda\alpha + (1-\lambda)\beta) \in E_f$ and hence $E_f$ is a convex set.

(ii) (Sufficiency) Let $E_f$ be a convex set in $R^{n+1}$. Let $x, u \in S$. Then $(x, f(x)) \in E_f$ and $(u, f(u)) \in E_f$. As $E_f$ is a convex set, we have for $0 \le \lambda \le 1$,

$(\lambda x + (1-\lambda)u,\ \lambda f(x) + (1-\lambda)f(u)) \in E_f$

i.e.

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u).$

Therefore $f$ is a convex function on $S$. □

Corollary 7.2.1 Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Then $f$ is a concave function on $S$ if and only if its hypograph $G_f$ is a convex set.

Theorem 7.2.2 Let $S \subseteq R^n$ be a convex set and $f : S \to R$. Let $f$ be a convex function. Then for all $\alpha \in R$, its $\alpha$-level sets are convex.

Proof. Let $x, u \in \Gamma_\alpha$ $(\alpha \in R)$. Then by the convexity of $f$, we have for $0 \le \lambda \le 1$

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u) \le \lambda\alpha + (1-\lambda)\alpha = \alpha,$

i.e. $\lambda x + (1-\lambda)u \in \Gamma_\alpha$. Hence $\Gamma_\alpha$ is a convex set. Since $\alpha$ is arbitrary, the result holds for all $\alpha \in R$. □

Remark 7.2.1 The converse of Theorem 7.2.2 is not true, as can be seen by the example shown in Fig. 7.8. Here $\Gamma_\alpha$ is convex for every $\alpha \in R$ but $f$ is not a convex function. In fact, those functions for which $\Gamma_\alpha$ is convex for every $\alpha \in R$ (e.g. Fig. 7.8) are called quasi-convex functions, to be studied in the later part of the book. Theorem 7.2.2 tells that every convex function is quasi-convex but the converse is not true. A specific counter example could be $f(x) = x^3$ $(x \in R)$, which is a quasi-convex function but not a convex function (check this graphically).

Fig. 7.8.

Differentiable Convex Functions

We now present some of the properties of differentiable and twice differentiable convex functions.

Theorem 7.2.3 Let $S \subseteq R^n$ be an open convex set and $f : S \to R$ be differentiable. Let $f$ be a convex function on $S$. Then for all $x, u \in S$, we have

$f(x) - f(u) \ge (x - u)^T \nabla f(u).$ (7.3)

Proof. By the convexity of $f$ on $S$, we have for $0 < \lambda \le 1$,

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u),$

i.e.

$\dfrac{f(u + \lambda(x-u)) - f(u)}{\lambda} \le f(x) - f(u).$ (7.4)

But we are given that $f$ is differentiable on $S$, which by definition means

$f(u + w) = f(u) + w^T\nabla f(u) + \alpha(u, w)\|w\|,$ (7.5)

where $u + w \in S$ and $\lim_{w \to 0}\alpha(u, w) = 0$. Therefore using (7.5) in (7.4), we get

$\dfrac{f(u) + \lambda(x-u)^T\nabla f(u) + \alpha(u, \lambda(x-u))\,\lambda\|x-u\| - f(u)}{\lambda} \le f(x) - f(u),$

i.e.

$f(x) - f(u) \ge (x-u)^T\nabla f(u) + \alpha(u, \lambda(x-u))\|x - u\|.$

Taking the limit as $\lambda \to 0^{+}$, we get (7.3). □
Corollary 7.2.2 Let $S \subseteq R^n$ be an open convex set and $f : S \to R$ be differentiable. Let $f$ be a concave function on $S$. Then for all $x, u \in S$, we have

$f(x) - f(u) \le (x - u)^T \nabla f(u).$


Remark 7.2.2 The converse of Theorem 7.2.3 is also true. Because by (7.3) we have

$f(x) - f(\lambda x + (1-\lambda)u) \ge (1-\lambda)(x-u)^T\nabla f(\lambda x + (1-\lambda)u)$ (7.6)

and

$f(u) - f(\lambda x + (1-\lambda)u) \ge -\lambda(x-u)^T\nabla f(\lambda x + (1-\lambda)u).$ (7.7)

Multiplying (7.6) by $\lambda$ and (7.7) by $(1-\lambda)$ and then adding, we get

$f(\lambda x + (1-\lambda)u) \le \lambda f(x) + (1-\lambda)f(u).$

In view of the above remark we can take (7.3) as the definition of a differentiable convex function. This inequality tells that for a differentiable convex function, the linearization $f(u) + (x-u)^T\nabla f(u)$ at $u$ never overestimates $f(x)$ for any $x \in S$. This is illustrated in Fig. 7.9.

In a similar manner, the definition of a differentiable concave function can also be interpreted geometrically.







Theorem 7.2.4 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be differentiable.
Then $f$ is a convex function on $S$ if and only if for all $x, u \in S$, we have
$$(x-u)^T [\nabla f(x) - \nabla f(u)] \ge 0. \tag{7.8}$$
Proof. (i) (Necessity). For $x, u \in S$, we have from Theorem 7.2.3
$$f(x) - f(u) - (x-u)^T \nabla f(u) \ge 0,$$
and
$$f(u) - f(x) - (u-x)^T \nabla f(x) \ge 0.$$
Adding these two inequalities, we get
$$(x-u)^T [\nabla f(x) - \nabla f(u)] \ge 0.$$


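(ii) (Sufficiency). Let $x, u \in S$. Then by the mean value theorem, we have
$$f(x) - f(u) = (x-u)^T \nabla f(u + \theta(x-u)) \quad \text{for some } 0 < \theta < 1. \tag{7.9}$$
Applying (7.8) to the points $u + \theta(x-u)$ and $u$, both of which lie in the convex set $S$, we get
$\theta (x-u)^T [\nabla f(u + \theta(x-u)) - \nabla f(u)] \ge 0$, i.e. $(x-u)^T \nabla f(u + \theta(x-u)) \ge (x-u)^T \nabla f(u)$.
Using this in (7.9) gives $f(x) - f(u) \ge (x-u)^T \nabla f(u)$, and therefore $f$ is convex on $S$ by Remark 7.2.2.

Fig. 7.9.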


Remark 7.2.3 For a concave differentiable function, the inequality (7.8) becomes
$$(x-u)^T [\nabla f(x) - \nabla f(u)] \le 0.$$


Remark 7.2.4 If $f : \mathbb{R} \to \mathbb{R}$, then inequality (7.8) means
$$(x-u)\,(f'(x) - f'(u)) \ge 0.$$
Thus $f'$ is an increasing function, which is the well known characterization of a
differentiable convex function of a real variable. So the inequality (7.8) is an
extension of this basic result to $\mathbb{R}^n$.









We next state the following characterization for twice differentiable functions.

Theorem 7.2.5 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be twice differentiable.
Then $f$ is a convex function on $S$ if and only if the Hessian matrix $H_f(x)$ is positive
semi-definite for all $x \in S$.


Let us recall that the Hessian matrix $H_f(x)$ is defined as the $n \times n$ matrix of second
order partial derivatives, i.e.
$$H_f(x) = \left[ \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right].$$
The Hessian is a symmetric matrix whenever the second order partial derivatives are
continuous. This matrix being positive semi-definite means $y^T H_f(x) y \ge 0$ for all $y \in \mathbb{R}^n$.

In a similar manner, $f$ is concave on $S$ if and only if $H_f(x)$ is negative semi-definite
for all $x \in S$, i.e. $y^T H_f(x) y \le 0$ for all $y \in \mathbb{R}^n$.

Let us now recall that $H_f(x)$ is called positive definite if $y^T H_f(x) y > 0$ for all $y \in \mathbb{R}^n$,
$y \ne 0$, and it is called negative definite if $y^T H_f(x) y < 0$ for all $y \in \mathbb{R}^n$, $y \ne 0$.


Theorem 7.2.6 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be twice differentiable.
If $H_f(x)$ is positive definite for all $x \in S$, then $f$ is a strictly convex function on $S$.

In a similar manner, if $H_f(x)$ is negative definite for all $x \in S$ then $f$ is a strictly
concave function on $S$.
The converse of Theorem 7.2.6 is not true, i.e. if $f$ is a strictly convex function then
$H_f(x)$ may not be positive definite (though it is certainly positive semi-definite), e.g.
$f(x) = x^4$, $x \in \mathbb{R}$ is a strictly convex function on $\mathbb{R}$ but $H_f(x) = 12x^2$ is not positive
definite for $x = 0$.


Example 7.2.1 Examine the convexity/strict-convexity of the functions
(i) $f(x_1, x_2) = 2x_1^2 + x_2^2 + 2x_1x_2$ and (ii) $f(x_1, x_2) = 4x_1^2 + x_2^2 + 4x_1x_2$.

Solution. (i) $f(x_1, x_2) = 2x_1^2 + x_2^2 + 2x_1x_2$ gives the Hessian matrix as
$$\begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix}$$
which is positive definite. Hence $f$ is a strictly convex function.
(ii) $f(x_1, x_2) = 4x_1^2 + x_2^2 + 4x_1x_2$ gives the Hessian matrix as
$$\begin{bmatrix} 8 & 4 \\ 4 & 2 \end{bmatrix}$$
which is positive semi-definite. Hence $f$ is a convex function.
In general, if $f : \mathbb{R}^n \to \mathbb{R}$ is a quadratic form $f(x) = x^T Q x$ where $Q$ is real
symmetric, then we can check that $\nabla f(x) = 2Qx$ and $H_f(x) = 2Q$. Therefore the nature
of $f$ depends on the nature of $Q$. If $Q$ is positive definite, then $f$ is a strictly convex
function; if it is positive semi-definite then $f$ is a convex function; if it is negative
definite then $f$ is a strictly concave function; if it is negative semi-definite then $f$ is a
concave function; and if it is indefinite then $f$ is neither a convex nor a concave function.

The easiest way to check if the matrix $Q$ is positive definite/positive semi-definite/negative
definite/negative semi-definite/indefinite is to obtain the eigen values of $Q$. As $Q$ is real
symmetric, all its eigen values $\lambda_1, \lambda_2, \ldots, \lambda_n$ are real. Then the following result is well known
in matrix theory.

1. $Q$ is positive definite if and only if all $\lambda_i > 0$.
2. $Q$ is positive semi-definite if and only if all $\lambda_i \ge 0$.
3. $Q$ is negative definite if and only if all $\lambda_i < 0$.
4. $Q$ is negative semi-definite if and only if all $\lambda_i \le 0$.
5. $Q$ is indefinite if and only if some $\lambda_i > 0$ and some $\lambda_j < 0$.
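The eigen value test is easy to mechanize. The following is a minimal Python sketch (not taken from the text) for the $2 \times 2$ symmetric case only, where the eigen values have a closed form. The function names `eigenvalues_sym2` and `classify` and the tolerance `tol` are illustrative assumptions.

```python
import math

def eigenvalues_sym2(a, b, c):
    """Eigen values (smaller, larger) of the real symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius

def classify(a, b, c, tol=1e-12):
    """Classify [[a, b], [b, c]] by the signs of its eigen values."""
    lo, hi = eigenvalues_sym2(a, b, c)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo >= -tol:
        # Smallest eigen value is (numerically) zero; the zero matrix also lands here.
        return "positive semi-definite"
    if hi <= tol:
        return "negative semi-definite"
    return "indefinite"

# Hessian of 4*x1^2 + x2^2 + 4*x1*x2, i.e. [[8, 4], [4, 2]]
print(classify(8, 4, 2))
# Hessian of x1*x2, i.e. [[0, 1], [1, 0]]
print(classify(0, 1, 0))
```

For larger matrices a numerical eigen value routine would be needed; the closed form used here is specific to two variables.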


7.3 Convex Optimization Problems 


By a convex optimization problem we mean the minimization of a convex function $f$
over a convex set $S$. Thus the optimization problem
$$\min_{x \in S} f(x) \tag{7.10}$$
is a convex optimization problem if $S \subseteq \mathbb{R}^n$ is a convex set and $f$ is a convex function
on $S$.

If the problem is given in the maximization form, i.e. $\max f(x)$ subject to $x \in S$, then
the problem will also be called a convex optimization problem if $S \subseteq \mathbb{R}^n$ is a convex
set and $f$ is a concave function on $S$.

We now present certain basic results with regard to the convex optimization problem
(7.10); the analogous results also hold for the maximization case with the obvious
modification of changing the convexity of $f$ to the concavity of $f$.
Theorem 7.3.1 Let $\bar{x}$ be a local min point of the convex optimization problem (7.10).
Then $\bar{x}$ is also its global min point.

Proof. We are given that $\bar{x}$ is a local min point of (7.10). This, by definition, implies
that there exists $\delta > 0$ such that $f(\bar{x}) \le f(u)$ for all $u \in N_\delta(\bar{x}) \cap S$, where $N_\delta(\bar{x})$ is a ball
of radius $\delta$ centered at $\bar{x}$.

Let $x \in S$ be an arbitrary point outside the set $N_\delta(\bar{x}) \cap S$. The theorem will then be
proved if we can show that $f(\bar{x}) \le f(x)$. Now, there certainly exists $\lambda$, $0 < \lambda < 1$, such that
$x' = \lambda x + (1-\lambda)\bar{x} \in N_\delta(\bar{x}) \cap S$. Fig. 7.10 makes this assertion quite clear.
As $\bar{x}$ is a local min point of (7.10) we have $f(\bar{x}) \le f(x')$. This gives
$$f(\bar{x}) \le f(\lambda x + (1-\lambda)\bar{x}).$$

Fig. 7.10.


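But $f$ is convex on $S$, and hence the above gives
$$f(\bar{x}) \le \lambda f(x) + (1-\lambda) f(\bar{x}),$$
i.e.
$$\lambda f(\bar{x}) \le \lambda f(x),$$
i.e.
$$f(\bar{x}) \le f(x) \quad \text{as } 0 < \lambda < 1,$$
which implies that $\bar{x}$ is a global min point.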


Proof. Let the set of all optimal solutions of problem (7.10) be denoted by $V$, i.e.
$$V = \{\bar{x} \in S : f(\bar{x}) \le f(x) \text{ for all } x \in S\}.$$
Let $u, w \in V$ and $\hat{x} = \lambda u + (1-\lambda) w$, $0 \le \lambda \le 1$. Then $\hat{x}$ certainly belongs to $S$ because $S$ is a
convex set. Also, by the convexity of $f$,
$$f(\hat{x}) \le \lambda f(u) + (1-\lambda) f(w). \tag{7.11}$$
As $u, w \in V$, we have $f(u) \le f(x)$ and $f(w) \le f(x)$ for all $x \in S$, and so (7.11) gives
$$f(\hat{x}) \le f(x) \text{ for all } x \in S.$$
Thus $\hat{x}$ is also optimal for (7.10), i.e. $\hat{x} \in V$. Therefore $V$ is a convex set.


Theorem 7.3.3 Let a nonconstant convex function $f$ be maximized over a convex set
$S$. Then no interior point of $S$ can be a maximizing point; i.e. if a maximizing point
exists, then it must be a boundary point of $S$.


Proof. If there is no maximizing point of $f$ over $S$, then the assertion is vacuously true.
Let us therefore assume that $f$ has a maximizing point $x^*$ over $S$. As $f$ is nonconstant
and $x^*$ is a maximizing point, there certainly exists a point $x \in S$ such that $f(x^*) > f(x)$.
Let $z$ be an arbitrary point in the interior of $S$. The theorem will be proved if we can
show that $z$ can never be a maximizing point of $f$ over $S$. By the definition of interior
point, there exists $\delta > 0$ such that $N_\delta(z) \subset S$. Then as the below given figure illustrates,
there is a point $y \in S$ and $\lambda$, $0 < \lambda < 1$, such that $z = \lambda x + (1-\lambda) y$.





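Fig. 7.11.

Now using the convexity of $f$ over $S$, we get
$$f(z) = f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y) < \lambda f(x^*) + (1-\lambda) f(x^*) = f(x^*), \tag{7.12}$$
where the strict inequality holds because $f(x) < f(x^*)$ and $f(y) \le f(x^*)$, $x^*$ being a maximizing point.
Thus $z$ cannot be a maximizing point of $f$ over $S$.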


In particular, if we are maximizing (respectively minimizing) a convex function (respectively concave function) over a polytope $S$, then
at least one corner point of the polytope is optimal.
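Theorem 7.3.4 Let $S \subseteq \mathbb{R}^n$ be convex and $f : S \to \mathbb{R}$ be a strictly convex function.
Then a minimizing point of $f$ over $S$, if it exists, is unique.

Proof. Let $\bar{x}$ and $x^*$ ($\bar{x} \ne x^*$) be two minimizing points of $f$ over $S$, i.e. $\bar{x} \in S$, $x^* \in S$,
$f(\bar{x}) \le f(x)$ for all $x \in S$, $f(x^*) \le f(x)$ for all $x \in S$, and $f(\bar{x}) = f(x^*)$.
Let $\hat{x} = \lambda \bar{x} + (1-\lambda) x^*$ for $0 < \lambda < 1$, so that $\hat{x} \in S$. Then by the strict convexity of $f$ we have
$$f(\hat{x}) = f(\lambda \bar{x} + (1-\lambda) x^*) < \lambda f(\bar{x}) + (1-\lambda) f(x^*) \le \lambda f(\bar{x}) + (1-\lambda) f(\hat{x}),$$
where the last step uses $f(x^*) \le f(\hat{x})$. This gives
$$\lambda f(\hat{x}) < \lambda f(\bar{x}),$$
i.e.
$$f(\hat{x}) < f(\bar{x}),$$
which contradicts the assumption that $\bar{x}$ is a minimizing point.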


7.4 Convex Programming Problems 


In many applications the set S over which the objective function f is to be minimized 
or maximized is not an abstract set in R” but rather it is prescribed by a finite number 
of constraint functions; i.e. the optimization problem is of the form 

Min $f(x)$
subject to
$$g_i(x) \le 0, \quad (i = 1, 2, \ldots, m). \tag{7.13}$$





These problems are called mathematical programming problems and, as discussed in 
Chapter 1, are classified into two broad classes, namely linear programming problems 
and nonlinear programming problems. 


While discussing the mathematics of the simplex method in Chapter 3, we noted
that we could develop the simplex method only because of the linearity structure of LPP's.
This linearity structure gave three very important properties for LPP's, namely (P1)
the feasible region is a convex set, (P2) if the given LPP has an optimal solution then
at least one corner point of the feasible region is optimal, and (P3) every local optimal
point is also a global optimal point. We also observed that, in general, there is no
guarantee that for nonlinear programming problems any of these properties hold, and
therefore there is no single algorithm which could solve every type of nonlinear
programming problem.

This motivates the study of those classes of problems for which some of the above properties
hold. One such class is that of convex programming problems. Basically a convex
programming problem is a problem of form (7.13) where we prescribe $f$ as a convex function
and impose conditions on the constraint functions $g_i$ $(i = 1, 2, \ldots, m)$ so that the feasible
region of (7.13) becomes a convex set. Let $S$ denote the feasible region of (7.13), i.e.
$$S = \{x \in \mathbb{R}^n : g_i(x) \le 0, \ (i = 1, 2, \ldots, m)\}.$$


Lemma 7.4.1 Let for each $i = 1, 2, \ldots, m$, $g_i$ be a convex function. Then $S$ is a
convex set.

Proof. Let $S_i$ be the set of points $x \in \mathbb{R}^n$ for which the $i$th constraint holds, i.e.
$$S_i = \{x \in \mathbb{R}^n : g_i(x) \le 0\}.$$
$S_i$ can also be identified as an $\alpha$-cut or $\alpha$-level set of the function $g_i$ for $\alpha = 0$. As
$g_i$ is a convex function, all its $\alpha$-cuts are convex sets; so in particular the $0$-cut is a convex
set, i.e. $S_i$ is a convex set. But $S = \bigcap_{i=1}^{m} S_i$ is the finite intersection of convex sets and
hence is a convex set.


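In view of the above, if we wish to recast the mathematical programming problem
(7.13) as a convex optimization problem, we need to ensure that the objective function
$f$ and the constraint functions $g_i$ $(i = 1, 2, \ldots, m)$ are convex functions.

Definition 7.4.1 (Convex Programming Problem) The mathematical programming problem (7.13)
is called a convex programming problem if $f$ and $g_i$ $(i = 1, 2, \ldots, m)$ are convex.

If the problem (7.13) is given in the maximization form or the constraints are in '$\ge$' form,
then we have to make appropriate modifications in the convexity requirements so that
we still get a convex programming problem. Let us remember that the bottom line is
to minimize a convex function over a convex set. The following table is self explanatory.

Problem                                                         Requirements
Min $f(x)$ subject to $g_i(x) \le 0$ $(i = 1, 2, \ldots, m)$    $f$ is convex, $g_i$ are convex
Min $f(x)$ subject to $g_i(x) \ge 0$ $(i = 1, 2, \ldots, m)$    $f$ is convex, $g_i$ are concave
Max $f(x)$ subject to $g_i(x) \le 0$ $(i = 1, 2, \ldots, m)$    $f$ is concave, $g_i$ are convex
Max $f(x)$ subject to $g_i(x) \ge 0$ $(i = 1, 2, \ldots, m)$    $f$ is concave, $g_i$ are concave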


Example 7.4.1 Check if the following problem
Max $4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 4$
$x_1 x_2 \le 1$
$x_1, x_2 \ge 0$
is a convex programming problem.

Solution Here $f(x_1, x_2) = 4x_1 + 3x_2$, $g_1(x_1, x_2) = x_1 + x_2 - 4$, $g_2(x_1, x_2) = x_1 x_2 - 1$,
$g_3(x_1, x_2) = -x_1$, and $g_4(x_1, x_2) = -x_2$. Since $f$ is a linear function which is being
maximized it can certainly be taken as a concave function. Also the functions $g_1$, $g_3$ and
$g_4$ are convex as they are linear functions of $(x_1, x_2)$. So we have to check the convexity
of the function $g_2$ only. For this we compute its Hessian and get
$$H_{g_2}(x_1, x_2) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
which is not positive semi-definite as its eigen values are $+1$ and $-1$. Thus $g_2$ is not a
convex function and so the given problem is not a convex programming problem. We are
concerned about the convexity of $g_1$, $g_2$, $g_3$, and $g_4$ because we have taken the constraints
in the '$\le$' form for defining $g_1$, $g_2$, $g_3$, and $g_4$ here.

In view of the above, there is no guarantee that the feasible region $S$ is a convex set.
In fact it is really not a convex set, as can be verified by plotting the given constraints
as shown in Fig. 7.12.
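The non-convexity of this feasible region can also be confirmed without a plot, by exhibiting two feasible points whose midpoint is infeasible. The following Python sketch assumes the constraints of Example 7.4.1 as read above ($x_1 + x_2 \le 4$, $x_1 x_2 \le 1$, $x_1, x_2 \ge 0$); the helper name `feasible` and the chosen points are illustrative.

```python
def feasible(x1, x2):
    """Membership test for the feasible region of Example 7.4.1."""
    return x1 + x2 <= 4 and x1 * x2 <= 1 and x1 >= 0 and x2 >= 0

# Two feasible points on the coordinate axes ...
p = (4.0, 0.0)
q = (0.0, 4.0)
# ... and their midpoint, which a convex set would have to contain.
mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

print(feasible(*p), feasible(*q), feasible(*mid))
```

The midpoint $(2, 2)$ violates $x_1 x_2 \le 1$, so the segment joining two feasible points leaves the region.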





Example 7.4.2 Check if the following problem is a convex programming problem.

Max $x_2$
subject to
$x_1^2 + x_2^2 \le 1$
$x_2 \le x_1^2$

Solution. It is a maximization problem and the objective function is linear so certainly
concave. The constraint functions are $g_1(x_1, x_2) = x_1^2 + x_2^2 - 1$ and $g_2(x_1, x_2) = -x_1^2 + x_2$ with
$$H_{g_1}(x_1, x_2) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \quad \text{and} \quad H_{g_2}(x_1, x_2) = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix}.$$
Here $H_{g_1}$ is positive definite but $H_{g_2}$ is negative semi-definite; and hence $g_2$ is not a
convex function. Therefore the given problem is not a convex programming problem. In
this case again we get a non-convex feasible region $S$ as illustrated in Fig. 7.13.

Fig. 7.13.


In the above problem, if we change the second constraint to $x_1^2 \le x_2$, then $\hat{g}_2(x_1, x_2) = x_1^2 - x_2$
comes out to be a convex function. This makes this new problem a convex
programming problem and its feasible region $S_1$ is a convex set.




3 oo - d ANE m, l; 4 À y He ,€ n const Ut A | Ç \ gi(x) < 0 (i == it} 2 ERA 7 m 
~ertainly be a non-convex function. However, if 


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7.5 Optimality Conditions: Motivations from Elementary Calculus 


In this section we shall try to write optimality conditions for the nonlinear programming
problem (7.13), taking a clue from what we have studied earlier in the calculus course for
minimizing/maximizing functions of one and two variables. As such, we do not plan to
give any mathematical proof here because optimality conditions for NLP's are studied
in detail later in Chapter 9.

Let us start with the unconstrained minimization/maximization of functions of one 
variable, i.e. the problem of the form ‘min f(x) over x € R’ or ‘max f(x) over x € R’. We 
assume that f is twice continuously differentiable over R. The following are the standard 
results in calculus. 


Theorem 7.5.1 If $x^* \in \mathbb{R}$ is a local min or local max point of $f$ over $\mathbb{R}$ then $f'(x^*) = 0$.


Geometrically, the above theorem tells that at an optimal point (which may be local
or global) the tangent line is parallel to the $x$-axis. If we agree to call those points $x$ for
which $f'(x) = 0$ the critical points or the stationary points of the function $f$, then
this implies that every optimal point of $f$ over $\mathbb{R}$ is a stationary point.

Obviously not every stationary point is an optimal point. Stationary points of f over 
R include points of local optima and points of inflexion. The above theorem is essentially 
a necessary condition for a point x* to be a local min point or local max point of f over 
R. We now state a sufficient condition in this regard. 


Theorem 7.5.2 (i) The point $x^* \in \mathbb{R}$ such that $f'(x^*) = 0$ is an unconstrained local
min point of $f$ over $\mathbb{R}$ if $f''(x^*) > 0$.
(ii) In a similar manner, the point $x^* \in \mathbb{R}$ such that $f'(x^*) = 0$ is an unconstrained local
max point of $f$ over $\mathbb{R}$ if $f''(x^*) < 0$.


Remark 7.5.1 In Theorem 7.5.2, when we are saying that it is a local min point, we
really mean strict local min point; i.e. there exists a $\delta > 0$ such that $f(x^*) < f(x)$ for all $x$
satisfying $|x - x^*| < \delta$, $x \ne x^*$. Similarly the point $x^*$ satisfying the hypothesis of Theorem
7.5.2 (ii) is really a strict local max point.
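As a small illustration of Theorems 7.5.1 and 7.5.2, the following Python sketch applies the first and second derivative conditions to a candidate point. The test function $f(x) = x^3 - 3x$, the helper name `second_derivative_test` and the tolerance are illustrative assumptions, not taken from the text.

```python
def second_derivative_test(d1, d2, x, tol=1e-9):
    """Apply Theorems 7.5.1/7.5.2 at x, given callables for f' (d1) and f'' (d2)."""
    if abs(d1(x)) > tol:
        return "not stationary"      # necessary condition f'(x) = 0 fails
    curvature = d2(x)
    if curvature > tol:
        return "strict local min"
    if curvature < -tol:
        return "strict local max"
    return "inconclusive"            # f''(x) = 0: the sufficient condition is silent

# Illustrative function f(x) = x**3 - 3*x with f'(x) = 3x^2 - 3 and f''(x) = 6x.
d1 = lambda x: 3.0 * x * x - 3.0
d2 = lambda x: 6.0 * x

for point in (-1.0, 0.0, 1.0):
    print(point, second_derivative_test(d1, d2, point))
```

The "inconclusive" branch corresponds to stationary points such as $x = 0$ for $f(x) = x^3$, a point of inflexion where the sufficient condition gives no information.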


We next take the case of a function of two real variables, i.e. $f : \mathbb{R}^2 \to \mathbb{R}$ such
that $f$ has continuous second order partial derivatives over $\mathbb{R}^2$. The following are again
standard results.






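Theorem 7.5.3 (Necessity) Let $(x^*, y^*) \in \mathbb{R}^2$ be a local min or local max point of $f$ over $\mathbb{R}^2$. Then
$$\frac{\partial f}{\partial x}(x^*, y^*) = 0, \quad \frac{\partial f}{\partial y}(x^*, y^*) = 0. \tag{7.14}$$

To state the sufficiency part, we introduce the following notations,
$$A = \left.\frac{\partial^2 f}{\partial x^2}\right|_{(x^*, y^*)}, \quad B = \left.\frac{\partial^2 f}{\partial x \, \partial y}\right|_{(x^*, y^*)}, \quad C = \left.\frac{\partial^2 f}{\partial y^2}\right|_{(x^*, y^*)}.$$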

Theorem 7.5.4 (Sufficiency)

(i) Let $(x^*, y^*) \in \mathbb{R}^2$ satisfy (7.14). Also let $A > 0$ and $AC - B^2 > 0$. Then $(x^*, y^*)$ is a
strict local min point of $f$ over $\mathbb{R}^2$.

(ii) Let $(x^*, y^*) \in \mathbb{R}^2$ satisfy (7.14). Also let $A < 0$ and $AC - B^2 > 0$. Then $(x^*, y^*)$ is
a strict local max point of $f$ over $\mathbb{R}^2$.

The main thing to understand here is that conceptually there is
no difference between optimizing a function of one variable or a function of two variables.
In the case of one variable, the necessary condition says that the tangent at $x^*$ is parallel
to the $x$-axis. For the case of two variables it is exactly the same interpretation, i.e. the
tangent plane at $(x^*, y^*)$ is parallel to the $xy$-plane, i.e. $z = $ constant.

Let us try to understand the geometric meaning of the sufficient condition now. In
particular we consider Theorem 7.5.2. Given that $f''(x^*) > 0$ and $f''$ is continuous, we
get that $f''(x) > 0$ in a certain neighborhood of $x^*$; i.e. the function looks like a parabola
($\cup$) locally around $x^*$ and therefore we are justified in declaring $x^*$ as a strict local
min point. Similarly, $f''(x^*) < 0$ and $f''$ being continuous really mean that $f$ looks like
an inverted parabola ($\cap$) locally around $x^*$, so $x^*$ can be declared as a strict local max
point. Thus the conditions of Theorem 7.5.2 essentially check the shape of the function
$f$ locally around $x^*$; to be precise these conditions check if in the neighborhood of $x^*$, $f$
is a strictly convex function or a strictly concave function. Exactly the same thing is
being done in Theorem 7.5.4 as well. Because if we construct the matrix
$$Q = \begin{bmatrix} A & B \\ B & C \end{bmatrix}$$
then the given conditions of Theorem 7.5.4 are essentially verifying that the matrix $Q$
is positive definite or negative definite, i.e. $f$ is a strictly convex function or a strictly
concave function in a neighborhood of the point $(x^*, y^*)$. So even if Theorem 7.5.2 and
Theorem 7.5.4 look different, geometrically they are doing the same thing, checking
the strict convexity/strict concavity of the objective function in the neighborhood of a
stationary point.

If we now consider a general unconstrained optimization problem, Min (or Max)
$f(x)$ over $x \in \mathbb{R}^n$, then under appropriate differentiability assumptions, we can state the
following generalizations of earlier Theorems 7.5.3 and 7.5.4.

Theorem 7.5.5 (Necessity) Let $x^* \in \mathbb{R}^n$ be a local min or local max point of $f$ over $\mathbb{R}^n$.
Then $\nabla f(x^*) = 0$.

Theorem 7.5.6 (Sufficiency) Let $x^* \in \mathbb{R}^n$ be such that $\nabla f(x^*) = 0$. If $f$ is a strictly convex
function in a neighborhood of $x^*$, then $x^*$ is a strict local min point of $f$ over $\mathbb{R}^n$. Similarly,
if $f$ is a strictly concave function in a neighborhood of $x^*$, then $x^*$ is a strict local max
point of $f$ over $\mathbb{R}^n$.

Since, in practice, it may be difficult to check the shape of the function locally, we
assume a general shape (convex/concave) over the entire domain $\mathbb{R}^n$. This gives the
following theorem.

Theorem 7.5.7 (Sufficiency) Let $x^* \in \mathbb{R}^n$ be such that $\nabla f(x^*) = 0$. If $f$ is a convex function over
$\mathbb{R}^n$, then $x^*$ is a global min point of $f$ over $\mathbb{R}^n$. Similarly if $f$ is a concave function over
$\mathbb{R}^n$, then $x^*$ is a global max point of $f$ over $\mathbb{R}^n$.
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Constrained Optimization with Equality Constraints 


We next consider the constrained optimization problem where all constraints are
given in '$=$' form, i.e. we wish to optimize $f(x)$ over the set $S$ where
$$S = \{x \in \mathbb{R}^n : g_i(x) = 0, \ (i = 1, 2, \ldots, m)\}.$$
Traditionally, such problems have been solved by the classical method of Lagrange
multipliers. Here we construct the Lagrange function or the Lagrangian as
$$L(x, \lambda_1, \lambda_2, \ldots, \lambda_m) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x), \quad x \in \mathbb{R}^n, \ \lambda \in \mathbb{R}^m.$$


Then we have the following theorem.
Theorem 7.5.8 (Necessity) Let $x^* \in \mathbb{R}^n$ be a local min or local max point of $f$ over the
feasible set $S = \{x \in \mathbb{R}^n : g_i(x) = 0, \ (i = 1, 2, \ldots, m)\}$ where $m < n$. Let it be possible
to choose a set of $m$ variables $x_j$ for which the Jacobian matrix $J = \left[\dfrac{\partial g_i}{\partial x_j}\right]$ has an
inverse. Then there exists a unique set of Lagrange multipliers $\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*$ such that
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0. \tag{7.15}$$
Remark 7.5.2 We need the invertibility of the Jacobian matrix because the proof
requires the application of the implicit function theorem.


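Remark 7.5.3 The conditions (7.15) together with the constraints $g_i(x^*) = 0$ $(i = 1, 2, \ldots, m)$
give a system of $(n + m)$ equations for the determination of $(n + m)$ unknowns
$x_1^*, x_2^*, \ldots, x_n^*, \lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*$. These conditions are essentially the stationarity
conditions of the Lagrangian, i.e.
$$\frac{\partial L}{\partial x_j} = 0 \ (j = 1, 2, \ldots, n), \quad \frac{\partial L}{\partial \lambda_i} = 0 \ (i = 1, 2, \ldots, m).$$
Also as $\lambda \in \mathbb{R}^m$ is unconstrained, we can take the Lagrangian also as
$$L(x, \lambda_1, \lambda_2, \ldots, \lambda_m) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x), \quad x \in \mathbb{R}^n, \ \lambda \in \mathbb{R}^m.$$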


Let us try to understand the geometrical meaning of Theorem 7.5.8. For this
let us take $n = 3$ and $m = 2$; i.e. there are two equality constraints $g_1(x) = 0$
and $g_2(x) = 0$, $x \in \mathbb{R}^3$. Clearly these two constraints describe two surfaces in three
dimensional space. If there has to be any feasible solution, these two surfaces must
intersect in a curve $C$. We can find the minimum value of the objective function by
examining surfaces $f(x) = $ constant $= k$ (say) as the constant $k$ is gradually decreased.
The intersection of any such surface $f(x) = k_0$ with the curve $C$ is a set of feasible points
where the objective value is $k_0$; and any local minimum on $C$ must occur at a point $P$ where
the surface $f(x) = k_{\min}$ is tangent to $C$.

Next, we note that $\nabla f$ is normal to the surface $f(x) = k_{\min}$, since $k_{\min}$ is constant.
Thus $\nabla f$ must also be normal to the curve $C$ at the point $P$. The other two gradients $\nabla g_1$
and $\nabla g_2$ are normal to the surfaces $g_1(x) = 0$ and $g_2(x) = 0$ respectively, so both of them are
also normal to $C$. So the plane normal to $C$ at $P$ contains all three gradient vectors, and
therefore these must be linearly dependent. Thus there exist scalars $\alpha_0, \alpha_1, \alpha_2$ such that
$\alpha_0 \nabla f + \alpha_1 \nabla g_1 + \alpha_2 \nabla g_2 = 0$ and $(\alpha_0, \alpha_1, \alpha_2) \ne 0$. But the constraint surfaces $g_1(x) = 0$ and
$g_2(x) = 0$ intersect to form a curve, i.e. $\alpha_1 \nabla g_1 + \alpha_2 \nabla g_2 = 0$ is possible only for $\alpha_1 = 0 = \alpha_2$.
Therefore $\alpha_0 \ne 0$. Taking $\lambda_1 = \alpha_1 / \alpha_0$ and $\lambda_2 = \alpha_2 / \alpha_0$ we get the conditions of the theorem.


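Theorem 7.5.9 (Sufficiency) Let $(x^*, \lambda^*) \in \mathbb{R}^n \times \mathbb{R}^m$ exist such that $g_i(x^*) = 0$ $(i = 1, 2, \ldots, m)$ and
$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0$.
Let $Z(x^*) = \{z \in \mathbb{R}^n : z^T \nabla g(x^*) = 0\}$, where $g(x) = \mathrm{col}(g_1(x), g_2(x), \ldots, g_m(x))$. Also
let $H_L(x^*, \lambda^*)$ denote the Hessian of the Lagrangian with respect to $x$ at the point $(x^*, \lambda^*)$. Furthermore let
$z^T H_L(x^*, \lambda^*) z > 0$ for all $z \in Z(x^*)$ with $z \ne 0$. Then $x^*$ is a strict local min point of $f$
subject to $g_i(x) = 0$, $(i = 1, 2, \ldots, m)$.

In a similar manner, if $z^T H_L(x^*, \lambda^*) z < 0$ for all $z \in Z(x^*)$ with $z \ne 0$, then $x^*$ is a
strict local max point of $f$ subject to $g_i(x) = 0$, $(i = 1, 2, \ldots, m)$.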


Interpretation of Lagrange Multipliers 







Let us consider the nonlinear programming problem (7.16), in which $b_i$ denotes the right hand side (resource) of the $i$th constraint. Then
$$\lambda_i^* = \left.\frac{\partial f(x)}{\partial b_i}\right|_{x = x^*}. \tag{7.17}$$
The above equation tells that the Lagrange multiplier $\lambda_i^*$ gives the rate of change of
the optimal attainable value of the objective function $f(x)$ with respect to change in
$b_i$. Thus if the $i$th resource $b_i$ is changed to $b_i + \Delta b_i$ then the objective function value is
expected to change by an amount $\lambda_i^* \Delta b_i$. We have already read such a result for LPP
and the same holds for problems of type (7.16) as well.


Example 7.5.1 Use the method of Lagrange multipliers to solve

Min $\dfrac{x^3}{3} - \dfrac{3y^2}{2} + 2x$
subject to
$x - y = 0$.

Solution We have $f(x, y) = \dfrac{x^3}{3} - \dfrac{3y^2}{2} + 2x$ and $g(x, y) = x - y$. Therefore the Lagrangian is
$$L(x, y, \lambda) = \frac{x^3}{3} - \frac{3y^2}{2} + 2x + \lambda(x - y).$$
Evaluating the partial derivatives of the Lagrangian $L(x, y, \lambda)$ w.r.t. $x$, $y$, $\lambda$ and equating
each to zero we get

$x^2 + 2 + \lambda = 0$
$-3y - \lambda = 0$
$x - y = 0$.

The solutions of the above system are $(x = 2, y = 2, \lambda = -6)$ and $(x = 1, y = 1, \lambda = -3)$. We
next compute the Hessian of the Lagrangian, i.e.
$$H_L(x, y, \lambda) = \begin{bmatrix} \dfrac{\partial^2 L}{\partial x^2} & \dfrac{\partial^2 L}{\partial x \, \partial y} \\ \dfrac{\partial^2 L}{\partial y \, \partial x} & \dfrac{\partial^2 L}{\partial y^2} \end{bmatrix} = \begin{bmatrix} 2x & 0 \\ 0 & -3 \end{bmatrix}.$$
At the point $(2, 2)$ this gives $H_L = \begin{bmatrix} 4 & 0 \\ 0 & -3 \end{bmatrix}$. Also $\nabla g(x^*, y^*) = (1, -1)^T$, so
$z^T \nabla g(x^*, y^*) = 0$ means that $z_1 - z_2 = 0$, where $z = (z_1, z_2)^T$. Then $z^T H_L z = 4z_1^2 - 3z_2^2 = z_1^2 > 0$
for $z \ne 0$. Therefore by Theorem 7.5.9, the point $(2, 2)$ is a strict local
min point. In a similar manner we can check that $(1, 1)$ is a strict local max point. We
may also note here that the given problem has neither a global min point nor a global max
point because $f(x, x) \to +\infty$ as $x \to \infty$ and $f(x, x) \to -\infty$ as $x \to -\infty$.
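The computations of Example 7.5.1 can be cross-checked numerically. The Python sketch below is illustrative only: it eliminates $y = x$ and $\lambda = -3y$ from the stationarity equations to obtain $x^2 - 3x + 2 = 0$, and evaluates the quadratic form $z^T H_L z$ along $z = (1, 1)^T$, which spans $Z$. All function names are assumptions made for this sketch.

```python
import math

def f(x, y):
    """Objective of Example 7.5.1."""
    return x ** 3 / 3.0 - 1.5 * y ** 2 + 2.0 * x

def lagrangian_gradient(x, y, lam):
    """Partial derivatives of L = f + lam * (x - y) w.r.t. x, y and lam."""
    return (x ** 2 + 2.0 + lam, -3.0 * y - lam, x - y)

def stationary_points():
    """Roots of x^2 - 3x + 2 = 0, with y = x and lam = -3x."""
    disc = math.sqrt(3.0 ** 2 - 4.0 * 2.0)
    roots = [(3.0 - disc) / 2.0, (3.0 + disc) / 2.0]
    return [(x, x, -3.0 * x) for x in roots]

def curvature_on_tangent(x):
    """z^T H_L z for z = (1, 1), with H_L = [[2x, 0], [0, -3]]."""
    return 2.0 * x - 3.0

for (x, y, lam) in stationary_points():
    kind = "strict local min" if curvature_on_tangent(x) > 0 else "strict local max"
    print((x, y, lam), lagrangian_gradient(x, y, lam), kind)
```

The multiplier at $(1, 1)$ comes out as $\lambda = -3$, which follows directly from $-3y - \lambda = 0$.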








Constrained Optimization with Inequality Constraints


We consider the inequality constrained optimization problem
Min $f(x)$
subject to
$$g_i(x) \le b_i, \quad (i = 1, 2, \ldots, m) \tag{7.18}$$
and try to get the optimality conditions for the same. Here we present a very elementary
and non-rigorous development of the Karush-Kuhn-Tucker (KKT) optimality conditions,
which is very much motivated by the method of Lagrange multipliers as discussed above.
But we would like to emphasize that the KKT optimality conditions constitute a very
important core topic in optimization, important from both the theoretical and the
algorithmic point of view, to which we shall devote a full chapter later in the
book.
Looking at problem (7.18), it is very natural to convert each inequality constraint
$g_i(x) \le b_i$ into an equation by adding a squared slack variable $s_i^2$ so that the given
problem can be rewritten as
Min $f(x)$
subject to
$$g_i(x) + s_i^2 = b_i \quad (i = 1, 2, \ldots, m). \tag{7.19}$$
Now problem (7.19) is an optimization problem with equality constraints and so the
method of Lagrange multipliers is applicable. Therefore we construct the Lagrangian
$$L(x, s, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i \left(g_i(x) + s_i^2 - b_i\right)$$
and use Theorem 7.5.8 to write the necessary conditions at the optimal point $(x^*, s^*, \lambda^*)$. These are
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0,$$
$$2 \lambda_i^* s_i^* = 0 \quad (i = 1, 2, \ldots, m), \tag{7.21}$$
$$g_i(x^*) + (s_i^*)^2 = b_i \quad (i = 1, 2, \ldots, m).$$
Here the multipliers $\lambda_i^*$ are unconstrained in sign as per the statement of Theorem 7.5.8. However from (7.17)
we also know that $\lambda_i^* = \partial f(x) / \partial b_i'$ where $b_i' = b_i - s_i^2$, because the constraint $g_i(x) \le b_i$ has
been expressed as $g_i(x) + s_i^2 = b_i$.
Now if we increase the value of $b_i$ (i.e. resource $b_i$ is available in $b_i + \delta b_i$, $\delta b_i > 0$, units)
then because the constraints are '$\le$' type, the feasible region of (7.19) will be enlarged
so we have more options now. Therefore the value of the objective function $f(x)$ will
improve and hence $\lambda_i^* \ge 0$. Also the equation (7.21) can be rewritten as $\lambda_i^* (s_i^*)^2 = 0$, i.e.
$\lambda_i^* (g_i(x^*) - b_i) = 0$. Therefore the optimality conditions are
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0$$
$$\lambda_i^* (g_i(x^*) - b_i) = 0 \quad (i = 1, 2, \ldots, m)$$
$$g_i(x^*) \le b_i \quad (i = 1, 2, \ldots, m)$$
$$\lambda_i^* \ge 0 \quad (i = 1, 2, \ldots, m).$$

These are precisely the celebrated Karush-Kuhn-Tucker (KKT) conditions for the
inequality constrained optimization problem (7.18).

In the above we must note that as we are using Theorem 7.5.8, the condition of the
invertibility of the Jacobian should also hold. This condition for the changed scenario
leads to the well known concept of constraint qualification. We shall have the opportunity to
discuss all these things in greater detail in Chapter 8 of the book.
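To see the four KKT conditions at work, the following Python sketch checks them at candidate points of a small problem. The problem itself, Min $(x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 \le 2$, is a hypothetical illustration and not taken from the text, and the helper names are likewise assumptions.

```python
def grad_f(x):
    """Gradient of the illustrative objective (x1 - 2)^2 + (x2 - 1)^2."""
    return (2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0))

def g(x):
    """Left hand side of the single constraint g(x) <= b, here x1 + x2."""
    return x[0] + x[1]

GRAD_G = (1.0, 1.0)   # gradient of g, constant because g is linear
B = 2.0               # right hand side b

def kkt_satisfied(x, lam, tol=1e-9):
    """Check stationarity, complementary slackness, feasibility and lam >= 0."""
    gf = grad_f(x)
    stationarity = all(abs(gf[j] + lam * GRAD_G[j]) <= tol for j in range(2))
    complementary = abs(lam * (g(x) - B)) <= tol
    feasible = g(x) <= B + tol
    nonnegative = lam >= -tol
    return stationarity and complementary and feasible and nonnegative

print(kkt_satisfied((1.5, 0.5), 1.0))   # candidate on the constraint boundary
print(kkt_satisfied((2.0, 1.0), 0.0))   # unconstrained minimizer, but infeasible
```

Since the objective and the constraint of this illustrative problem are convex, a point passing the check is a global min point; for general problems the conditions are only necessary, subject to a constraint qualification.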


7.6 Quadratic Programming 





We introduced the class of convex programming problems in Section 7.4 and developed
the KKT necessary/sufficient optimality conditions for the same in Section 7.5. Before
developing algorithms for the general convex programming problem, here in this section
we restrict ourselves to a very special case of convexity, i.e. the convexity of a
positive semi-definite quadratic form in $n$ variables. This leads to a quadratic programming
problem (QPP), i.e. an optimization problem where we are minimizing (maximizing) a
positive (negative) semi-definite quadratic form of $n$ variables subject to linear constraints.
The linearity of the constraints ensures that the feasible region is a
convex set and the convexity of the objective function ensures that
every local min point is also a global min point. Here we should note that
it is not true that the optimal solution of a quadratic
programming problem occurs at a corner point or even at the boundary of the
feasible region. It is easy to construct examples of QPP's in
which the optimal point can be anywhere, be it a corner
point, a boundary point or an interior point.

There are two main algorithms for solving QPP's. These are Wolfe's method and
Beale's method. While Wolfe's method requires that the quadratic form to
be minimized is positive definite, Beale's method works even when it is
positive semi-definite. In our presentation here, we shall discuss Wolfe's method only,
mainly because of two reasons. Firstly, this algorithm gives an immediate application of
the KKT conditions in the algorithmic development of nonlinear programming problems
and secondly it uses only the Phase-I of the simplex algorithm with certain appropriate
modifications. At the first reading, we may wonder how the simplex algorithm enters
here because that is used to solve LPP's, but if we think for a moment we get the
answer immediately. As the objective function in QPP is a positive definite quadratic
form and the constraints are linear, the KKT conditions will 'almost' be a system of
linear equations in non-negative variables. We have already seen that the Phase-I of the
simplex algorithm could be employed to solve a linear system for non-negative variables
and therefore it is natural to employ the same to solve QPP's.


7.7 Wolfe’s Method for Quadratic Programming 


We consider the following quadratic programming problem

Max $c^T x + x^T D x$
subject to
$Ax \le b$
$$x \ge 0, \tag{7.23}$$
where $c \in \mathbb{R}^n$, $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $A = [a_{ij}]$ is an $(m \times n)$ matrix, and $D = [d_{ij}]$ is an $(n \times n)$
symmetric negative semi-definite matrix.

Here we note that $D$ is taken to be negative semi-definite because problem (7.23)
is in the maximization form. This makes the objective function a concave function as
required for the maximization case.

We now write the KKT conditions for problem (7.23). For this we write problem
(7.23) in the form in which the KKT conditions have been stated in the last section.
This leads to

$-$Min $(-f(x))$
subject to
$G(x) \le 0$
$$-x \le 0, \tag{7.24}$$
where $f(x) = c^T x + x^T D x$ and $G(x) = Ax - b$.

Since in the above problem the minus sign is outside the minimization, we need to
solve the problem

Min $(-f(x))$
subject to
$G(x) \le 0$
$$-x \le 0, \tag{7.25}$$
with $f$ and $G$ as defined before. Then problems (7.23) and (7.25) will have the same
optimal solutions, though the optimal value of the given problem (7.23) will be the
negative of the optimal value of problem (7.25).
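Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)^T$ and $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^T$ be the Lagrange multipliers
corresponding to the constraints $Ax - b \le 0$ and $-x \le 0$ respectively. Then
$$\nabla(-f(x)) = -\nabla f(x) = -c - 2Dx, \quad \nabla G(x) = A \quad \text{and} \quad \nabla(-Ix) = -I.$$
Here $\nabla G(x)$ is to be understood as $[\nabla g_1(x), \ldots, \nabla g_m(x)]^T$, $g_i(x)$ being $\sum_{j=1}^{n} a_{ij} x_j - b_i$ for
$i = 1, 2, \ldots, m$. The expression $\nabla(-Ix) = -I$ is understood similarly for the constraint
$-x \le 0$. Hence the KKT conditions for problem (7.25) are
$$-c - 2Dx + A^T \lambda - I\mu = 0$$
$$\lambda_i \left( \sum_{j=1}^{n} a_{ij} x_j - b_i \right) = 0 \quad (i = 1, 2, \ldots, m)$$
$$\mu_j x_j = 0 \quad (j = 1, 2, \ldots, n)$$
$$Ax - b \le 0$$
$$x, \lambda, \mu \ge 0. \tag{7.26}$$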
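Now defining $s_i = b_i - \sum_{j=1}^{n} a_{ij}x_j$ $(i = 1, 2, \ldots, m)$ and $s = (s_1, s_2, \ldots, s_m)^T$, we have from (7.26)

$$\begin{aligned}
\begin{pmatrix} -2D & A^T & -I & 0 \\ A & 0 & 0 & I \end{pmatrix}
\begin{pmatrix} x \\ \lambda \\ \mu \\ s \end{pmatrix} &= \begin{pmatrix} c \\ b \end{pmatrix} \\
x, \lambda, \mu, s &\ge 0 \\
\lambda_i s_i &= 0 \quad (i = 1, 2, \ldots, m) \\
\mu_j x_j &= 0 \quad (j = 1, 2, \ldots, n).
\end{aligned} \tag{7.27}$$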
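
The linear part of system (7.27) can be assembled directly from the data $c$, $D$, $A$ and $b$. The following is a minimal sketch, assuming plain Python lists for the data and the column order $(x, \lambda, \mu, s)$; the function name `kkt_system` is illustrative only.

```python
def kkt_system(c, D, A, b):
    """Return (rows, rhs) of the linear equations in (7.27).

    Columns are ordered as x (n), lambda (m), mu (n), s (m).
    """
    n, m = len(c), len(b)
    rows, rhs = [], []
    for j in range(n):                      # -2Dx + A^T lambda - mu = c
        row = [-2 * D[j][k] for k in range(n)]
        row += [A[i][j] for i in range(m)]
        row += [-1 if k == j else 0 for k in range(n)]
        row += [0] * m
        rows.append(row)
        rhs.append(c[j])
    for i in range(m):                      # Ax + s = b
        row = list(A[i]) + [0] * m + [0] * n
        row += [1 if k == i else 0 for k in range(m)]
        rows.append(row)
        rhs.append(b[i])
    return rows, rhs
```

With $c = (1,1)^T$, $D = \begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}$, $A = [2\ \ 1]$ and $b = (1)$ this gives three equations in the six variables $x_1, x_2, \lambda_1, \mu_1, \mu_2, s_1$.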





Now we recall the sufficient part of the KKT theorem and infer that if $(x^*, \lambda^*, \mu^*, s^*)$ is a
solution of the above KKT system (7.27) then $x^*$ is a global min point of problem (7.25)
or equivalently $x^*$ is a global max point of problem (7.23). So now the main goal should
be to solve the given KKT system (7.27) efficiently. For this we note that the KKT system
consists of two structurally different subsystems. The first subsystem is essentially
a system of linear equations in non-negative variables $x$, $\lambda$, $\mu$ and $s$, which can always
be solved by employing the Phase-I of the simplex algorithm. The second subsystem
consists of the complementary slackness conditions $\lambda_i s_i = 0 = \mu_j x_j$ for all $i$ and $j$,
which define a system of nonlinear equations. Though the first subsystem is easy to solve
(by using the Phase-I of the simplex algorithm), its solution need not satisfy the
complementary slackness conditions, namely $\lambda_i s_i = 0 = \mu_j x_j$ for all $i$ and $j$. However, looking
at these conditions carefully, we note that they imply that


$$\lambda_i > 0 \Rightarrow s_i = 0 \quad (s_i > 0 \Rightarrow \lambda_i = 0),$$

and

$$\mu_j > 0 \Rightarrow x_j = 0 \quad (x_j > 0 \Rightarrow \mu_j = 0).$$
Therefore, in the absence of degeneracy, the complementary slackness conditions imply
that while solving the first subsystem by the Phase-I of the simplex algorithm we should
not make $\lambda_i$ and $s_i$ basic variables at the same time, and similarly we should not make
$\mu_j$ and $x_j$ basic variables at the same time. In other words, we should choose $\lambda_i$ as
an entering variable at the current iteration only when we are sure that $s_i$ will be a
non-basic variable in the next tableau. Similar arguments hold for the variables $\mu_j$
and $x_j$ as well. Thus in any simplex tableau only one of $\lambda_i$ and $s_i$ (for the same $i$) and
only one of $\mu_j$ and $x_j$ (for the same $j$) should be a basic variable.
We should therefore explore whether such a restricted entry simplex algorithm, in which
entry to the basis is restricted because only one of $\lambda_i$ and $s_i$ (and only one of $\mu_j$ and $x_j$)
can be a basic variable at a time, always succeeds in solving the KKT system.
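
The restricted entry rule described above can be phrased as a small admissibility test. This is only a sketch: variables are represented here as hypothetical `(kind, index)` pairs such as `("lam", 1)` or `("s", 1)`, a convention introduced for this illustration and not taken from the text.

```python
# Complementary pairs: lambda_i with s_i, and mu_j with x_j.
PARTNER_KIND = {"x": "mu", "mu": "x", "lam": "s", "s": "lam"}

def complementary_partner(var):
    """Return the variable paired with `var` by complementary slackness."""
    kind, idx = var
    return (PARTNER_KIND[kind], idx)

def may_enter(var, basis, leaving=None):
    """A variable may enter only if its partner is non-basic in the next tableau.

    That is the case when the partner is not basic now, or when the partner
    is exactly the variable that leaves the basis in this pivot.
    """
    partner = complementary_partner(var)
    return partner not in basis or partner == leaving
```

For example, with basis `{("a", 1), ("a", 2), ("s", 1)}` (two artificial variables and $s_1$), `may_enter(("lam", 1), basis)` is false unless $s_1$ is the leaving variable.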


Theorem 7.7.1 If $D$ is negative definite then the quadratic programming problem
(7.23) cannot have an unbounded solution.





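Proof. Take any finite $r$ $(> 0)$ and consider any point $x$ lying on the hypersphere
$\|x\| = r$. Then $x = r d$, where $d$ is a point on the unit hypersphere with centre at the origin.
Now

$$x^T D x = r^2 d^T D d \le r^2 \max_{\|d\|=1} d^T D d = r^2 d_0^T D d_0 \ (\text{say}) = r^2 m_0 \ (\text{say}) < 0. \tag{7.28}$$

Here it may be noted that $m_0$ definitely exists because $d^T D d$ is a continuous function of
$d$ and $\{d : \|d\| = 1\}$ is a closed bounded set of $\mathbb{R}^n$. Further $d_0 \ne 0$ and so by the negative
definiteness of $D$, $m_0$ is less than zero. Thus, as $\|x\| \to +\infty$, $x^T D x \to -\infty$.

Further, for a non-zero $x$, we have

$$f(x) = c^T x + x^T D x = x^T D x \left(1 + \frac{c^T x}{x^T D x}\right).$$

But

$$\left|\frac{c^T x}{x^T D x}\right| = \frac{1}{r}\left|\frac{c^T d}{d^T D d}\right| \le \frac{1}{r} \max_{\|d\|=1}\left|\frac{c^T d}{d^T D d}\right| = \frac{1}{r}\left|\frac{c^T d_1}{d_1^T D d_1}\right| = \frac{m_1}{r} \ (\text{say}).$$

Here $d_1$ is the point at which the maximum of $\left|\frac{c^T d}{d^T D d}\right|$ is attained and $m_1$ is the
corresponding maximum value. Hence $\frac{c^T x}{x^T D x} \to 0$ as $r \to +\infty$, and therefore $f(x) \to -\infty$
as $\|x\| \to +\infty$. The result of Theorem 7.7.1 follows immediately from this because the feasible region of
problem (7.23) is a subset of $\mathbb{R}^n$.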
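
The bound used in the proof can be observed numerically for a small case. The sketch below assumes a $2 \times 2$ symmetric matrix, for which the largest eigenvalue (the quantity $m_0$) has a closed form; the data $c = (1,1)^T$ and $D$ are illustrative choices, and the helper names are hypothetical.

```python
import math

def quad(D, x):
    """Quadratic form x^T D x for a 2x2 matrix."""
    return sum(x[i] * D[i][j] * x[j] for i in range(2) for j in range(2))

def largest_eigenvalue_2x2(D):
    """m0 = max of d^T D d over unit vectors d, for symmetric 2x2 D."""
    a, b, d = D[0][0], D[0][1], D[1][1]
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)

def objective(c, D, x):
    """f(x) = c^T x + x^T D x."""
    return c[0] * x[0] + c[1] * x[1] + quad(D, x)

c = [1, 1]
D = [[-1, 1], [1, -2]]            # negative definite
m0 = largest_eigenvalue_2x2(D)    # strictly negative here

# Along any ray x = r d the quadratic term is at most m0 * r^2,
# so f decreases without bound as r grows.
for k in range(12):
    d = (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    values = [objective(c, D, (r * d[0], r * d[1])) for r in (10, 100, 1000)]
    assert values[0] > values[1] > values[2]
```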
The result of Theorem 7.7.1 may not hold if $D$ is negative semi-definite. For this we
have the following example.
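Example 7.7.1 Use Wolfe’s method to solve the following QPP

$$\begin{aligned}
\text{Max}\quad & z = x_1 + x_2 - 4x_1^2 + 4x_1x_2 - x_2^2 \\
\text{subject to}\quad & x_1 - x_2 \le 1 \\
& x_1, x_2 \ge 0.
\end{aligned}$$

Solution We have $x = (x_1, x_2)^T$, $c = (1, 1)^T$, $b = (1)$, $A = [1, -1]$ and

$$D = \begin{pmatrix} -4 & 2 \\ 2 & -1 \end{pmatrix}.$$

Obviously $D$ is negative semi-definite. Now observe that for any $\theta > 0$,
$x_1 = \theta$, $x_2 = 2\theta$
is a feasible solution of the problem and the value of the objective function is
$z = \theta + 2\theta - (2\theta - 2\theta)^2 = 3\theta$.

Thus as $\theta \to \infty$, $z \to \infty$. So the given problem has an unbounded solution.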
The above example suggests that possibly Theorem 7.7.1 is true even for the negative 
semi-definite case, provided c = 0. This is correct and readers may like to verify the same. 
In view of Theorem 7.7.1, some finite feasible point $x^*$ must be a global maximum of
problem (7.23). Hence by the necessary part of the KKT theorem, it is necessary that
$x^*$ satisfies the corresponding KKT system. Thus the above KKT system definitely has
a solution.


Remark 7.7.1 In case $D$ is negative definite, the objective function of the quadratic
programming problem (7.23) is strictly concave and so if it has an optimal solution, it
has a unique optimal solution. However, when $D$ is negative semi-definite, then to ensure
that it has a bounded optimal solution we not only need that the given QPP is
feasible but also require that $c = 0$ (refer to the proof of Theorem 7.7.1). Therefore
we expect that Wolfe’s method will converge for the cases (i) $D$ is negative definite and
(ii) $D$ is negative semi-definite with $c = 0$, and this is really true.


We now describe Wolfe’s algorithm to find a solution of the KKT system.








Wolfe’s Method: A Stepwise Description

We now give the stepwise procedure for solving the quadratic programming
problem (7.23). We assume that the given QPP has a feasible solution, which
can be checked by using the Phase-I of the usual simplex method.
Step 1 Ignore the complementary slackness conditions in the KKT system (7.27) and
consider the remaining system of linear equations in the non-negative variables $x$, $\lambda$, $\mu$
and $s$.

Step 2 Check that in the linear system, all components of $c$ and $b$ are non-negative. If
some component of $c$ or $b$ is negative, multiply the corresponding equation by $-1$.

Step 3 After performing Step 2, add an appropriate number of artificial variables $x_{a_k}$ to
get an identity matrix of order $(m + n)$. Construct the Phase-I problem with the usual
objective function $-\sum_k x_{a_k}$. At this stage we should note that at the initial b.f.s
of the Phase-I problem so constructed, the desired complementary slackness conditions
($\lambda_i s_i = \mu_j x_j = 0$) for ($i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$) can be assumed to hold automatically.
This is because, if need be, we can add artificial variables to all constraints, even though
an identity column is present, so that the initial b.f.s consists of artificial variables only.

Step 4 Solve the problem in Step 3 by the restricted basis entry method. (This method
is the same as the usual simplex method, except that while entering a column, we choose
the one for which the relative cost $z_j - c_j$ is negative and which does not make both $\lambda_i$
and $s_i$, or $\mu_j$ and $x_j$, basic variables at the same time.)

Step 5 Stop when either all relative costs are non-negative or it is not possible to enter
a non-basic column with a negative relative cost without violating the complementary
slackness conditions. The basic solution so obtained will be the optimal solution of the
given quadratic programming problem (see the convergence theorem, Theorem 7.7.2).

The above step will be justified provided we prove that at the end of Step 5, all
artificial variables will be at the zero level. Before we prove this, we wish to illustrate
the algorithm with the help of the following examples.
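
The five steps above can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `wolfe_qp` is hypothetical, $D$ is assumed symmetric with integer or rational entries, an artificial variable is added to every row (the option mentioned in Step 3), the entering column is the admissible one with the most negative relative cost, and no anti-cycling rule is included.

```python
from fractions import Fraction

def wolfe_qp(c, D, A, b, max_iter=1000):
    """Maximise c'x + x'Dx s.t. Ax <= b, x >= 0 via restricted-entry Phase-I.

    Returns (x, lam, mu, s) as lists of Fractions, or None if the method
    stops with an artificial variable still positive.
    """
    n, m = len(c), len(b)
    nv, k = 2 * n + 2 * m, n + m          # columns: x | lambda | mu | s
    rows = []
    for j in range(n):                    # Step 1: -2Dx + A'lambda - mu = c
        r = [Fraction(0)] * (nv + 1)
        for t in range(n):
            r[t] = Fraction(-2 * D[j][t])
        for i in range(m):
            r[n + i] = Fraction(A[i][j])
        r[n + m + j] = Fraction(-1)
        r[nv] = Fraction(c[j])
        rows.append(r)
    for i in range(m):                    # Step 1: Ax + s = b
        r = [Fraction(0)] * (nv + 1)
        for t in range(n):
            r[t] = Fraction(A[i][t])
        r[2 * n + m + i] = Fraction(1)
        r[nv] = Fraction(b[i])
        rows.append(r)
    # Step 2: make every right-hand side non-negative.
    rows = [[-v for v in r] if r[nv] < 0 else r for r in rows]
    # Step 3: one artificial variable per row; they form the initial basis.
    T = [r[:nv] + [Fraction(int(p == i)) for p in range(k)] + [r[nv]]
         for i, r in enumerate(rows)]
    basis = [nv + i for i in range(k)]
    for _ in range(max_iter):             # Steps 4 and 5
        art = [i for i in range(k) if basis[i] >= nv]
        if all(T[i][-1] == 0 for i in art):
            break                         # all artificials at zero level
        best = None
        for j in range(nv):               # artificials never re-enter
            if j in basis:
                continue
            cost = -sum(T[i][j] for i in art)   # relative cost z_j - c_j
            ratios = [(T[i][-1] / T[i][j], i) for i in range(k) if T[i][j] > 0]
            if cost >= 0 or not ratios:
                continue
            r = min(ratios)[1]            # leaving row by the ratio test
            partner = j + k if j < k else j - k   # x<->mu, lambda<->s
            if partner in basis and basis[r] != partner:
                continue                  # would violate complementarity
            if best is None or cost < best[0]:
                best = (cost, j, r)
        if best is None:
            break                         # no admissible entering column
        _, j, r = best
        piv = T[r][j]
        T[r] = [v / piv for v in T[r]]
        for i in range(k):
            if i != r and T[i][j] != 0:
                f = T[i][j]
                T[i] = [a - f * e for a, e in zip(T[i], T[r])]
        basis[r] = j
    values = [Fraction(0)] * (nv + k)
    for i, var in enumerate(basis):
        values[var] = T[i][-1]
    if any(values[nv:]):
        return None
    return values[:n], values[n:k], values[k:k + n], values[k + n:nv]
```

Exact `Fraction` arithmetic is used so that the complementary slackness products are exactly zero rather than zero up to rounding. For the data $c = (1,1)^T$, $D = \begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}$, $A = [2\ \ 1]$, $b = (1)$, the sketch stops with all artificial variables out of the basis.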


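Example 7.7.2 Use Wolfe’s method to solve the following QPP

$$\begin{aligned}
\text{Max}\quad & z = x_1 + x_2 - x_1^2 + 2x_1x_2 - 2x_2^2 \\
\text{subject to}\quad & 2x_1 + x_2 \le 1 \\
& x_1, x_2 \ge 0.
\end{aligned}$$

Solution Here $x = (x_1, x_2)^T$, $c = (1, 1)^T$, $b = (1)$, $A = [2\ \ 1]$ and $D = \begin{pmatrix} -1 & 1 \\ 1 & -2 \end{pmatrix}$.
Clearly $D$ is negative definite. Now, noting that $\lambda$ and $s$ will have only one component
each and $\mu$ will have two components, we have the following KKT system for the given
QPP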
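$$\begin{pmatrix} 2 & -2 & 2 & -1 & 0 & 0 \\ -2 & 4 & 1 & 0 & -1 & 0 \\ 2 & 1 & 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \lambda_1 \\ \mu_1 \\ \mu_2 \\ s_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \tag{7.29}$$

$$x_1, x_2, \lambda_1, \mu_1, \mu_2, s_1 \ge 0,$$
$$\lambda_1 s_1 = \mu_1 x_1 = \mu_2 x_2 = 0.$$

Since the slack variable $s_1$ has already given one identity column, and all the components
of $c$ and $b$ are non-negative, we add only two artificial variables, namely $x_{a_1}$ and
$x_{a_2}$, and get the following Phase-I problem

$$\text{Max} \quad -x_{a_1} - x_{a_2}$$

subject to

$$\begin{pmatrix} 2 & -2 & 2 & -1 & 0 & 1 & 0 & 0 \\ -2 & 4 & 1 & 0 & -1 & 0 & 1 & 0 \\ 2 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \lambda_1 \\ \mu_1 \\ \mu_2 \\ x_{a_1} \\ x_{a_2} \\ s_1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \tag{7.30}$$

$$x_1, x_2, \lambda_1, \mu_1, \mu_2, x_{a_1}, x_{a_2}, s_1 \ge 0,$$

and

$$\lambda_1 s_1 = \mu_1 x_1 = \mu_2 x_2 = 0.$$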









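The starting tableau for the above is

$$\begin{array}{c|cccccccc|c}
\text{Basis} & x_1 & x_2 & \lambda_1 & \mu_1 & \mu_2 & x_{a_1} & x_{a_2} & s_1 & \text{R.H.S} \\ \hline
z_j - c_j & 0 & -2 & -3 & 1 & 1 & 0 & 0 & 0 & -2 \\ \hline
x_{a_1} & 2 & -2 & 2 & -1 & 0 & 1 & 0 & 0 & 1 \\
x_{a_2} & -2 & 4 & 1 & 0 & -1 & 0 & 1 & 0 & 1 \\
s_1 & 2 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}$$

At this initial basic feasible solution the conditions $\lambda_1 s_1 = \mu_1 x_1 = \mu_2 x_2 = 0$ are
automatically satisfied because $\lambda_1$, $\mu_1$, $\mu_2$, $x_1$ and $x_2$ are non-basic.
Had it been the usual simplex method, we would have entered $\lambda_1$, as it has the most
negative relative cost. But $\lambda_1$ may enter only if $s_1$ becomes a non-basic variable in the
next iteration, and $s_1$ cannot leave the basis because its entry in the $\lambda_1$ column is zero.
So $\lambda_1$ is not admissible here, and we enter $x_2$ instead, whose complementary variable $\mu_2$
is non-basic.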
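Example 7.7.3 Use Wolfe’s method to solve the following QPP

$$\begin{aligned}
\text{Max}\quad & z = 2x_1 + 3x_2 - 3x_1^2 + 2x_1x_2 - 2x_2^2 \\
\text{subject to}\quad & -x_1 - x_2 \le -1 \\
& 3x_1 + 4x_2 \le 12 \\
& x_1, x_2 \ge 0.
\end{aligned}$$

Solution Now $c = (2, 3)^T$, $x = (x_1, x_2)^T$, $b = (-1, 12)^T$ and

$$A = \begin{pmatrix} -1 & -1 \\ 3 & 4 \end{pmatrix}, \qquad D = \begin{pmatrix} -3 & 1 \\ 1 & -2 \end{pmatrix}.$$

We can also check that $D$ is negative definite and the given QPP can be solved by
Wolfe’s method. The KKT system to be solved is

$$\begin{pmatrix} 6 & -2 & -1 & 3 & -1 & 0 & 0 & 0 \\ -2 & 4 & -1 & 4 & 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 3 & 4 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \lambda_1 \\ \lambda_2 \\ \mu_1 \\ \mu_2 \\ s_1 \\ s_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ -1 \\ 12 \end{pmatrix}$$

$$x_j, \lambda_i, \mu_j, s_i \ge 0 \quad (i = 1, 2;\ j = 1, 2)$$
$$\lambda_i s_i = 0 \quad (i = 1, 2)$$

and

$$\mu_j x_j = 0 \quad (j = 1, 2).$$

Since the third component in the R.H.S is negative, we multiply the third equation by
$-1$ and then add three artificial variables, namely $x_{a_1}$, $x_{a_2}$ and $x_{a_3}$, to get the following
Phase-I problem:

$$\text{Max} \quad -x_{a_1} - x_{a_2} - x_{a_3}$$

subject to

$$\begin{pmatrix} 6 & -2 & -1 & 3 & -1 & 0 & 0 & 0 & 1 & 0 & 0 \\ -2 & 4 & -1 & 4 & 0 & -1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \\ 3 & 4 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ \lambda \\ \mu \\ s \\ x_a \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 1 \\ 12 \end{pmatrix}$$

$$x_j, \lambda_i, \mu_j, s_i \ge 0 \quad (i = 1, 2;\ j = 1, 2), \qquad x_{a_1}, x_{a_2}, x_{a_3} \ge 0,$$
$$\lambda_i s_i = 0 \quad (i = 1, 2)$$

and

$$\mu_j x_j = 0 \quad (j = 1, 2).$$

Here $x = (x_1, x_2)^T$, $\lambda = (\lambda_1, \lambda_2)^T$, $\mu = (\mu_1, \mu_2)^T$, $s = (s_1, s_2)^T$ and $x_a = (x_{a_1}, x_{a_2}, x_{a_3})^T$. The problem can now be solved by
the restricted entry simplex method in a manner similar to Example 7.7.2. We can
verify that $(x_1 = 7/10,\ x_2 = 11/10)$ is the optimal solution of the given QPP and the
optimal value is 47/20.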
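
The claimed optimal solution can be verified against the KKT system (7.27) by direct substitution. The following is a minimal sketch; the function names are illustrative, and the matrix $D = \begin{pmatrix} -3 & 1 \\ 1 & -2 \end{pmatrix}$ is an assumption reconstructed from the stated solution and optimal value, since the printed data is partly illegible.

```python
from fractions import Fraction as F

def kkt_check(c, D, A, b, x, lam, mu):
    """Check system (7.27) for a candidate; returns (ok, slack vector s)."""
    n, m = len(c), len(b)
    s = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    stationarity = all(
        -2 * sum(D[j][k] * x[k] for k in range(n))
        + sum(A[i][j] * lam[i] for i in range(m)) - mu[j] == c[j]
        for j in range(n))
    nonneg = all(v >= 0 for v in list(x) + list(lam) + list(mu) + s)
    complementary = (all(lam[i] * s[i] == 0 for i in range(m))
                     and all(mu[j] * x[j] == 0 for j in range(n)))
    return stationarity and nonneg and complementary, s

def objective(c, D, x):
    """z = c^T x + x^T D x."""
    n = len(c)
    return (sum(c[j] * x[j] for j in range(n))
            + sum(x[i] * D[i][j] * x[j] for i in range(n) for j in range(n)))

# Data of Example 7.7.3 (D is the assumed reconstruction noted above).
c, D = [2, 3], [[-3, 1], [1, -2]]
A, b = [[-1, -1], [3, 4]], [-1, 12]
x = [F(7, 10), F(11, 10)]
ok, s = kkt_check(c, D, A, b, x, lam=[0, 0], mu=[0, 0])
```

Both constraints are inactive at this point, so all multipliers are zero and the point is simply the unconstrained maximizer of the objective.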


Theorem 7.7.2 (Convergence Theorem).

For the quadratic programming problem (7.23), let either (i) $D$ be negative definite or
(ii) $D$ be negative semi-definite and $c = 0$. Then, in the absence of degeneracy, Wolfe’s
method always converges in a finite number of iterations.

Although we shall not prove this result, we shall verify the same through examples only.

subject to
$2x_1 + x_2 \le 1$
$x_1, x_2 \ge 0.$

Solution Here $c = (0, 0)^T$, $A = [2\ \ 1]$, $b = (1)$ and $D$ is negative
semi-definite. As $c = 0$, we expect the algorithm to converge.


For the given problem the KKT conditions are:

$$x_1, x_2, \lambda_1, \mu_1, \mu_2, s_1 \ge 0,$$
$$\lambda_1 s_1 = \mu_1 x_1 = \mu_2 x_2 = 0.$$

The first restricted entry simplex tableau is






Now 





Here if we have to enter $\lambda_1$ (as it is the only variable for which the cost coefficient is negative)
then in the next tableau $\lambda_1$ and $s_1$ both will become basic variables (note that $s_1$ cannot
leave the basis because the corresponding $y_{ij}$ value is zero). So as per the convergence
theorem, all $x_{a_i}$ should be zero and that is happening here. Therefore an optimal solution
of the given QPP is $x_1^* = 0$, $x_2^* = 0$, as $(x_1^* = 0, x_2^* = 0, \lambda_1^* = 0, \mu_1^* = 0, \mu_2^* = 0, s_1^* = 1)$ is a
solution of the KKT system.

The convergence essentially follows because if we force $\lambda_1$ to enter then it enters
here at a ‘zero’ level, i.e. in the next tableau, though it will be a basic variable, its
value will be zero. Thus no complementary slackness conditions will be violated. The
next example suggests that this may not happen if $D$ is negative semi-definite and $c \ne 0$.


Example 7.7.5 Use Wolfe’s method to solve the following QPP

$$\begin{aligned}
\text{Max}\quad & z = 2x_1 + x_2 - x_1^2 + 2x_1x_2 - x_2^2 \\
\text{subject to}\quad & 2x_1 + x_2 \le 1 \\
& x_1, x_2 \ge 0.
\end{aligned}$$

Solution Here $c = (2, 1)^T$, $A = [2\ \ 1]$, $b = (1)$ and $D = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$. The matrix $D$ is
negative semi-definite as the corresponding quadratic form equals $-(x_1 - x_2)^2$, which is
less than or equal to zero for $x \in \mathbb{R}^2$ and is zero for $x_1 = x_2 \ne 0$. The initial
tableau of Wolfe’s method is








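$$\begin{array}{c|cccccccc|c}
\text{Basis} & x_1 & x_2 & \lambda_1 & \mu_1 & \mu_2 & x_{a_1} & x_{a_2} & s_1 & \text{R.H.S} \\ \hline
x_{a_1} & 2 & -2 & 2 & -1 & 0 & 1 & 0 & 0 & 2 \\
x_{a_2} & -2 & 2 & 1 & 0 & -1 & 0 & 1 & 0 & 1 \\
s_1 & 2 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ \hline
z_j - c_j & 0 & 0 & -3 & 1 & 1 & 0 & 0 & 0 & -3
\end{array}$$

Here $\lambda_1$ is the only variable with a negative relative cost, but it cannot enter the basis
because $s_1$ is basic and cannot leave (its entry in the $\lambda_1$ column is zero), while the
artificial variables are still at a positive level. This terminates
the algorithm and it fails to give an optimal solution of the given QPP. In this context,
it may be noted that the given QPP certainly has an optimal solution as the feasible
region is a polytope and the objective function is a continuous function of $x_1$ and $x_2$.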
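
The termination in Example 7.7.5 can be checked by computing the relative costs of the initial Phase-I tableau. This is a small sketch with the tableau rows entered by hand from the data $c = (2,1)^T$, $A = [2\ \ 1]$, $b = (1)$ and $D = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$; the dictionary layout and names are choices made for this illustration.

```python
columns = ["x1", "x2", "lam1", "mu1", "mu2", "xa1", "xa2", "s1"]

# Rows of the initial tableau, keyed by basic variable (last entry: R.H.S).
tableau = {
    "xa1": [2, -2, 2, -1, 0, 1, 0, 0, 2],
    "xa2": [-2, 2, 1, 0, -1, 0, 1, 0, 1],
    "s1":  [2, 1, 0, 0, 0, 0, 0, 1, 1],
}

# Phase-I objective: Max -xa1 - xa2, so artificial variables have cost -1.
cost = {name: (-1 if name.startswith("xa") else 0) for name in columns}

def relative_costs(tableau, cost, columns):
    """z_j - c_j = sum over basic rows of c_B * y_ij, minus c_j."""
    return [sum(cost[basic] * row[j] for basic, row in tableau.items())
            - cost[name]
            for j, name in enumerate(columns)]

zc = relative_costs(tableau, cost, columns)
candidates = [name for name, v in zip(columns, zc) if v < 0]
# lam1 is the only candidate; its complementary variable s1 is basic and its
# row has a zero in the lam1 column, so s1 cannot leave the basis.
s1_pivot_entry = tableau["s1"][columns.index("lam1")]
```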
7.8 Summary and Additional Notes


• This chapter presents an elementary introduction to convex/concave functions
and the convex optimization problem.

• Section 7.2 discusses various characterizations and properties of convex functions
which are relevant to the study of finite dimensional optimization problems.

• Sections 7.4 and 7.5 are devoted to the study of convex programming problems.
The main results presented here are the KKT necessary/sufficient optimality conditions,
which are motivated by the well known theorems of calculus for
minimizing/maximizing a function of n variables.

• This chapter also discusses the quadratic programming problem (QPP) where we
maximize a concave quadratic form subject to linear constraints.

• The main method for solving QPP’s, namely Wolfe’s method, is presented in Section
7.7. It is shown that this method is convergent if either (i) D is negative definite or
(ii) D is negative semi-definite and c = 0.

• Jensen is generally credited with introducing convex functions in 1905, though Hadamard
in 1893 and Hölder in 1889 also had related work on this topic.

• Some of the standard references on convex sets and convex functions are Rockafellar
[136], Roberts and Varberg [131], Stoer and Witzgall [150] and Borwein and Lewis
[26]. Most of the standard texts on nonlinear programming also have chapters on
convex functions, e.g. Mangasarian [109], Bazaraa and Shetty [11] and Avriel [6].
The book by Boyd and Vandenberghe [27] is a recent addition to the literature
which gives a complete modern theory and algorithmic development of convex
optimization problems.

• Wolfe’s method is directly based on the KKT system of the given QPP, which is solved
by the restricted entry simplex method. This method was developed by P. Wolfe in
1959.

• Another popular method for solving QPP’s is Beale’s method, developed by E.
M. L. Beale in 1959.

• Though the topic of quadratic programming is included in most texts on nonlinear
programming, e.g. Avriel [6], Bazaraa and Shetty [11] and Simmons [143], there are
certain dedicated books on quadratic programming itself, e.g. Boot [23] and Van de
Panne [121].

• Certain structured QPP’s have become very important in the recent past because
they are potentially very useful in the areas of machine learning and portfolio
optimization. In particular, if the given QPP has only one linear constraint or has only
box constraints (such problems occur very naturally in the area of machine learning)
we can use some of the recently developed and very efficient algorithms due to
Dai and Fletcher [43, 58].

• Quadratic programming is very closely related to the linear complementarity
problem and bi-matrix games. In fact, we can also solve QPP’s by employing Lemke’s
complementary pivoting algorithm, developed by Lemke in the 1960s.


7.9 Exercises 


7.1 Test whether each of the following functions is convex, concave, or neither convex
nor concave
| A) = 6x — x* -3, x ER. 
2. $f(x_1, x_2) = x_1x_2 + x_1 + x_2$, $(x_1, x_2) \in \mathbb{R}^2$.
8. tr, %2) = x7 + 32x 1x2 + 2x2 = 10x, — 10x, (x1, 2) ER. 
4. Fixe, X2) = a + 2XyXo + Pose = 5x1 + AX, (x1, X2) e R?. 
n 5. f(x1,%2) — =i F 2X1X2 ar 3X1X3 Sr 6X2X3, (X1,X2, x3) S R$. 


7.2 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x_1, x_2) = ax_1^2 + bx_2^2 + 2cx_1x_2 + d$. Find values of $a$, $b$, $c$
and $d$ for which (i) $f$ is convex (ii) $f$ is concave.


7.3 Let f : R —> R be a convex function. Show that g(y) = f(2—y), y € R, is a concave 
function on R. 


7.4 Let f : R > R be a conver function. Let g(x) = f(f(x)). Under what conditions, 
is the function g convex? Verify your answer for (i) f(x) = x° (ü) f(x) =e™ and (ii) 
f(x) = (1 — x). 


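7.5 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$. Show that $f$ is a convex function if and
only if for any integer $k \ge 2$, we have

$$f\Big(\sum_{i=1}^{k} \lambda_i x^{(i)}\Big) \le \sum_{i=1}^{k} \lambda_i f(x^{(i)}),$$

for all $x^{(1)}, \ldots, x^{(k)} \in S$ and for all $0 \le \lambda_i \le 1$ $(i = 1, 2, \ldots, k)$ with $\sum_{i=1}^{k} \lambda_i = 1$.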


Bret a cp) o whore a> Oand 420 C= 12...) BIS 
— Concave function? 


A 4s the function fp 2,-++1Pn) == 


your answer. (Th 





2e piln pi; Pi > 0 (i = 12s me tt) Qa Conver 
is function is the well known entropy func- 


aie: — Sn i - Jr ov, al 
qve Treason Jol 
LUO FOWMIUIY FU! 


a Á y d Í E = = 
eR fori=1,2,..-,m (such functions are 
Hany nphere cp © R JONE ree e WO” 


~ P axe f. od he. sop V A \\ > ) * : — } 5 g f ) 
T onrh £as a CONVEL Jun chon of Xi (i E E 2, eM 
PORGI U Wiper. SOS Aa. a rf See is 
the se also true? Give reason for 
an A a A ete — (eh a z d 
Ee 


, panne T: l 
Te the CONVET oS 
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ne ee (0 1] 

7.9 Let f : [0,2] > R be given by f(®)=)o-xx€ (1, 2]. | 
Show that f is a concave function. Let g: KO R be a gwen by g&(x)=min (f(x), x2), 


Is g also a concave function? 
7.10 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$ be continuous. Then $f$ is said to be a
midpoint convex function if for all $x^{(1)}, x^{(2)} \in S$, we have

$$f\Big(\frac{x^{(1)} + x^{(2)}}{2}\Big) \le \frac{1}{2} f(x^{(1)}) + \frac{1}{2} f(x^{(2)}).$$

Show that $f$ is a midpoint convex function if and only if it is a convex function. (Note
that the continuity of $f$ is crucial here.)


7.11 Using the definition of convexity, derive the famous arithmetic geometric mean
inequality $\sum_{i=1}^{n} \lambda_i a_i \ge \prod_{i=1}^{n} a_i^{\lambda_i}$, where $a_i$ are the given positive numbers and $\lambda_i$ are
arbitrary non-negative weights satisfying $\sum_{i=1}^{n} \lambda_i = 1$.


7.12 Are the following convex programming problems? Give reasons for your answer 


(1) Max In(1 + x1) + x2 
subject to, 


2x, +x% <3 
X1, X2 > 0. 
(2) Maz X1 — 2x2 


subject to, 
Max(0, x1) — x2 < 0 


+xe <4. 
(3) Min ne + Xo 
subject to, 
lx] + |x2| < 2 
i, = Pa « 0, 
(4) Min XT T KA 
subject to, 
—xX1X2+1<0 
X1, X2 2 0. 
(S Min lx — 1| + |x — 4| 
subject to, 
Oe as. 
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(6) Min h(x) 


i subject to, 


), x3), A O<x<5 


where P(x) = Max(x2, (x — 2), 4), 


be Q (7) Max 
yy subject to, 
Note tne <1 


4x, = 3x2 





OT 
(%4 = 1)? +32 <1. 


7.13 Let f : [1,3] > R be given by 


O z -1<x<1 


e E AET 


Jer 


Sketch f,Ef,Gf and Ta for æ =1, a =0, and a = -1/2. 


7.14 Let $f(x) = x^T Q x$ be a quadratic form in two variables with $x = (x_1, x_2)^T$ and

$$Q = \begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \end{pmatrix}, \qquad q_{12} = q_{21}.$$

Let $q_{11} > 0$ and $q_{11}q_{22} - q_{12}^2 > 0$. Show that the given quadratic form is positive definite
and hence $f$ is a strictly convex function. What can you say about the nature of the
quadratic form and the function $f$ if $q_{11} \ge 0$ and $q_{11}q_{22} - q_{12}^2 \ge 0$? Give reasons for your
answer. Extend the result to quadratic forms of n variables.


7.15 Use the definition of convexity to show that if $X$ is a random variable such that
$X \in$ domain of $f$ with probability 1, and $f$ is convex, then $f(E(X)) \le E(f(X))$, provided
the expectations exist (this inequality is the famous Jensen's inequality).








king O as the origin, let P(x, y) 


71 late of side 4 meteres. Ta Ut 
6 Let OABC be a square plate jj EEE o TN i 


be any point on the plate. Let the temperature at the por 
ECP) = 2xy — 2x - 2y 3 





t on the plate. Formulate the above as an optimization 


__ Ttis desired to find the hottest poin 
i e desired point. 


Problem and solve the same to get th 
an TET Coe 





: RE a hape with vertices as A: (0,0), B: (1,0) and 
. shopping plaza is trianguiar 1" or” P: (x,y) in the plaza such that the sum 
t> he installed at a P pint F: ( Ue le ai, 
d | UC tltolvuewnr™ ya i aut ae | - oo Pa is eas j 
1 the three corners ¢ » do ee <i 


© 
am a, Ki N 
hai Se ay 
_ 7 
4 
1. Formulate the above as an optimization problem.
2. It is claimed that the above optimization problem can be solved by Wolfe’s method.
Justify this claim.
3. Determine the point P by employing a suitable numerical optimization technique.


7.18 Consider the following QPP

$$\begin{aligned}
\text{Min}\quad & (x_1 - x_2)^2 + x_2 \\
\text{subject to}\quad & -x_1 + x_2 \le 0 \\
& x_1 + 2x_2 \le 3 \\
& x_1, x_2 \ge 0.
\end{aligned}$$

1. Express the objective in the standard QPP form $c^T x + x^T D x$. Is $D$ positive definite?
2. Solve the above QPP by Wolfe’s method and identify difficulties if you face any.


7.19 Use Wolfe’s method to find a point P : (x,y) in the co-ordinate plane which lies 
on the line x + 2y = 4 and is nearest to the origin. 


7.20 For a QPP in the form ‘Maz c!x+x'Dx subject to Ax < b,x > 0’, we have the 
following KKT conditions. 


X1 

X2 
eee kD | Ome OA 2 
e et 4 () =! 0-0] wu 3 
een 0 Ome Ol! a | | —T 
Pee 20 Os CO” 20) T I a 12 

W) 

w2 


X1, X2, U1, U2, V1, V2, W1, W2 > 0, 
V1X1 = VX? = uU1w1 = W2u2 = om 
1. Write the (QPP) being solved. 
2. Perform one complete iteration of Wolfe’s Method to solve the (QPP) as obtained 
at (1) above. 


T2 mE et § a i 2 ; 
ea eu x2) E€ R a + {eas Mi and Sı = S U {(2,2)}. Consider the 


É d EA 
1. Express the above as a standard QPP.
2. Solve the QPP so obtained graphically.
3. Write the KKT system and justify your answer, as obtained at (2) above, by using
the KKT theorem.


7.22 Given two points $P : (x_1, y_1)$ and $Q : (x_2, y_2)$ in the co-ordinate plane $\mathbb{R}^2$, there
are three standard metrics which are used for defining the distance between points P
and Q. These are

$$d_1(P, Q) = |x_1 - x_2| + |y_1 - y_2|$$
$$d_2(P, Q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
$$d_\infty(P, Q) = \max(|x_1 - x_2|, |y_1 - y_2|)$$

Let there be four facilities located at points (1,2), (-2,4), (2,6) and (-6,3). A new facility
is to be located at a point P : (x,y) so that the sum of its distances from the four existing
facilities is minimum. Formulate the above optimization problem when the distance is
taken in the sense $d_1$, $d_2$ and $d_\infty$ respectively.

Extend the definitions of $d_1$, $d_2$ and $d_\infty$ for points P and Q in $\mathbb{R}^n$.
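7.23 Solve the following QPP by Wolfe’s method

$$\begin{aligned}
\text{Min}\quad & (x_1 - 3)^2 + (x_2 - 3)^2 \\
\text{subject to}\quad & x_1 + x_2 \le 2 \\
& x_1 - x_2 \le 1 \\
& x_1, x_2 \ge 0.
\end{aligned}$$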


7.24 In a resource allocation problem, let $b_i$ $(i = 1, 2, \ldots, m)$ denote the maximum
availability of the $i^{th}$ resource and $a_{ij}$ denote the units of the $i^{th}$ resource used in producing one
unit of the $j^{th}$ product $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$. Let the per unit cost for
the $j^{th}$ product be proportional to the units of the $j^{th}$ product produced, with the constant
of proportionality as $c_j$ $(j = 1, 2, \ldots, n)$.

Formulate the above as an optimization problem and identify if it is a LPP or a
QPP.

Optimality Conditions and Duality in Nonlinear Programming


8.1 Introduction 


Taking motivation from the classical calculus methods to optimize a function of one or
more real variables, in Chapter 7 we have already given an elementary introduction to
the Karush-Kuhn-Tucker (KKT) optimality conditions for a general nonlinear programming
problem. In this chapter we try to be a bit more formal mathematically
and give a formal proof of the KKT necessary/sufficient optimality conditions, and use
the same to construct the standard Wolfe dual of the given problem. In this context
we would like to remark that this aspect of nonlinear programming, if done properly, is
mathematically very involved as it uses many advanced tools of convex analysis and
generalized gradients. In our presentation here, we do not plan to go into these details and
keep ourselves at a somewhat lower level in terms of using mathematical tools by
assuming that the objective and constraint functions of the given nonlinear programming
problem are continuously differentiable.


8.2 Feasible Directions and Linearizing Cone 


Consider the nonlinear programming problem

$$\begin{aligned}
\text{Min}\quad & f(x) \\
\text{subject to}\quad & g_i(x) \le 0, \quad (i = 1, 2, \ldots, m).
\end{aligned} \tag{8.1}$$

Here we assume that the feasible region $S$ of problem (8.1) is contained in
a region of $\mathbb{R}^n$ where $f$ and $g_i$ $(i = 1, 2, \ldots, m)$ are defined and continuously
differentiable.

Definition 8.2.1 (Local/Global Min Point). A point $\bar{x} \in S$ is called
a local min point of problem (8.1) if $\exists\ \delta > 0$ such that
$f(\bar{x}) \le f(x)$ for all $x \in S$ with $\|x - \bar{x}\| < \delta$;
300 Numerical Optimization with Applications
if the above inequality holds for all $x \in S$, then $\bar{x}$ is called a global min point of the
given nonlinear programming problem.


Definition 8.2.2 (Feasible Direction). Let $x$ be a feasible point, i.e. $x \in S$. We define
a feasible direction at $x$ to be any direction $d$ (i.e. a vector $d \in \mathbb{R}^n$) with the property that
$x + \alpha d$ is in the feasible set $S$ for all sufficiently small $\alpha \ge 0$, i.e. a direction $d \in \mathbb{R}^n$ is
feasible at $x \in S$ if $\exists\ \sigma > 0$ such that $x + \alpha d \in S$ $\forall\ 0 \le \alpha \le \sigma$.

We shall denote by $D(x)$ the set of all feasible directions at $x$, i.e.

$$D(x) = \{d \in \mathbb{R}^n : \exists\ \sigma > 0 \ni 0 \le \alpha \le \sigma \Rightarrow x + \alpha d \in S\}.$$

Thus a direction $d$ is in $D(x)$ if we can move a small distance from $x$ in the direction of
$d$ and still remain in the feasible set. Obviously if $x \in \text{Int } S$ then $D(x) = \mathbb{R}^n$. In Fig. 8.1
arrows indicate the feasible directions at the given point.

Fig. 8.1.
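
The definition of a feasible direction can be probed numerically by sampling step lengths. The sketch below is only a heuristic under stated assumptions: it tests finitely many values of $\alpha$ in $(0, \sigma]$, so a `True` answer is evidence rather than proof. The function name and the sample constraints $g_1(x) = x_1^3 - x_2$, $g_2(x) = x_1^3 + x_2$ are illustrative choices.

```python
def is_feasible_direction(constraints, x, d, sigma=0.1, samples=20):
    """Sample alpha = sigma / 2^k and test g_i(x + alpha d) <= 0 for all i."""
    for k in range(samples):
        alpha = sigma / 2 ** k
        point = [xi + alpha * di for xi, di in zip(x, d)]
        if any(g(point) > 0 for g in constraints):
            return False       # this step leaves the feasible set S
    return True

# Illustrative feasible set S = {x : x1^3 - x2 <= 0, x1^3 + x2 <= 0}.
constraints = [
    lambda x: x[0] ** 3 - x[1],
    lambda x: x[0] ** 3 + x[1],
]
origin = [0.0, 0.0]
```

At the origin of this set, the direction $(-1, 0)$ passes the test while $(1, 0)$ and $(0, 1)$ fail it immediately.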

Our aim here is to establish certain necessary/sufficient optimality conditions for a 
point x to be a local min/ global min point of the given nonlinear programming problem 
(8.1). In this development, the set D(x) as introduced above will play a major role. We 
shall be presenting results for the minimization case, the maximization case will follow 
analogously. 


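Lemma 8.2.1 Let $\bar{x}$ be a local min point of problem (8.1). Then $d^T \nabla f(\bar{x}) \ge 0$ for all
$d \in D(\bar{x})$.

Proof. If possible, let there exist $d_1 \in D(\bar{x})$ with $d_1^T \nabla f(\bar{x}) < 0$. Noting that $d_1^T \nabla f(\bar{x})$ is the
directional derivative of the function $f$ at $\bar{x}$ in the direction $d_1$, we have

$$\lim_{\theta \to 0} \frac{f(\bar{x} + \theta d_1) - f(\bar{x})}{\theta} < 0.$$

Hence $\exists\ \delta > 0$ such that

$$\frac{f(\bar{x} + \theta d_1) - f(\bar{x})}{\theta} < 0 \quad \text{whenever } -\delta < \theta < \delta,\ \theta \ne 0.$$

Therefore, in particular, $\exists\ \delta > 0$ such that

$$f(\bar{x} + \theta d_1) - f(\bar{x}) < 0, \quad \forall\ 0 < \theta < \delta,$$

i.e. $f(\bar{x} + \theta d_1) < f(\bar{x})$ for $0 < \theta < \delta$. Since $d_1$ is a feasible direction, $\bar{x} + \theta d_1 \in S$ for all
sufficiently small $\theta > 0$.

But this contradicts that $\bar{x}$ is a local min point of (8.1).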
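
Lemma 8.2.1 can be observed on a small instance. The following sketch uses a hypothetical problem that does not appear in the text: minimize $f(x) = x_1^2 + x_2^2$ over $S = \{x : 1 - x_1 \le 0\}$, whose minimizer is $\bar{x} = (1, 0)^T$ and whose feasible directions at $\bar{x}$ are exactly those with $d_1 \ge 0$.

```python
def grad_f(x):
    """Gradient of the illustrative objective f(x) = x1^2 + x2^2."""
    return [2 * x[0], 2 * x[1]]

def directional_derivative(x, d):
    """d^T grad f(x)."""
    return sum(g * di for g, di in zip(grad_f(x), d))

x_bar = [1, 0]     # minimiser of f over S = {x : x1 >= 1}
directions = [(1, 0), (1, 1), (0, -1), (2, -3), (0, 1), (-1, 0), (-1, 2)]

# At x_bar the feasible directions are exactly those with d1 >= 0.
feasible = [d for d in directions if d[0] >= 0]
infeasible = [d for d in directions if d[0] < 0]
```

Every feasible direction gives a non-negative directional derivative, while the descent directions such as $(-1, 0)$ all point out of $S$.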


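Corollary 8.2.1 Let $\bar{x}$ be a local min point of (8.1). Then $d^T \nabla f(\bar{x}) \ge 0$ for all $d \in \bar{D}(\bar{x})$.
Here $\bar{D}(\bar{x})$ denotes the closure of the set $D(\bar{x})$.

Proof. Any direction $d \in \bar{D}(\bar{x})$ may be expressed as the limit of directions $d^{(k)}$ of $D(\bar{x})$,
i.e.

$$d \in \bar{D}(\bar{x}) \Rightarrow \exists\ \{d^{(k)}\},\ d^{(k)} \in D(\bar{x}) \text{ such that } d = \lim_{k \to \infty} d^{(k)}.$$

But by Lemma 8.2.1, $(d^{(k)})^T \nabla f(\bar{x}) \ge 0$ for all $d^{(k)} \in D(\bar{x})$ and hence

$$\lim_{k \to \infty} (d^{(k)})^T \nabla f(\bar{x}) \ge 0, \quad \text{i.e.} \quad d^T \nabla f(\bar{x}) \ge 0,$$

which proves the corollary.

Corollary 8.2.2 In case the local min point $\bar{x}$ of the nonlinear programming problem
(8.1) is in the interior of $S$ then $D(\bar{x}) = \mathbb{R}^n$. Hence Lemma 8.2.1 gives $d^T \nabla f(\bar{x}) \ge 0$
$\forall\ d \in \mathbb{R}^n$, which implies that $\nabla f(\bar{x}) = 0$, the well known first order necessary optimality
condition.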


Definition 8.2.3 (Usable Direction). Let $x \in S$. A direction $d \in \mathbb{R}^n$ is said to be a
usable direction at $x$ if $\exists\ \sigma > 0$ such that $f(x + \alpha d) < f(x)$ for all $0 < \alpha \le \sigma$.

Thus a small movement from $x$ along a usable direction $d$ strictly improves the value of
the objective function $f$.
Definition 8.2.4 (Usable Feasible Direction). A direction $d$ which is both feasible
and usable is called a usable feasible direction at $x$.

In particular, let $\hat{x} \in S$ with $\nabla f(\hat{x}) \ne 0$. Then the direction $d = -\nabla f(\hat{x})$ is usable at $\hat{x}$.

Proof. We have to show that $d = -\nabla f(\hat{x})$ is usable at $\hat{x}$. For this, it is enough to show
that $(\nabla f(\hat{x}))^T d < 0$. But

$$(\nabla f(\hat{x}))^T d = -(\nabla f(\hat{x}))^T \nabla f(\hat{x}) = -\|\nabla f(\hat{x})\|^2 < 0.$$

So if $d$ is feasible, it is also usable feasible.

Feasible direction vectors are important in many numerical optimization algorithms as
they suggest a direction of movement from the current point $x^{(k)}$.


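Definition 8.2.5 (Active Constraints). Let $\hat{x} \in S$. A constraint is called active at $\hat{x}$
if it holds as an equation at $\hat{x}$.

Let $I = \{1, 2, \ldots, m\}$ and $I_1(\hat{x}) = \{i \in I : g_i(\hat{x}) = 0\}$. Then $I_1(\hat{x})$ is the collection of all
active constraints at $\hat{x}$. Let $I_2(\hat{x}) = I \setminus I_1(\hat{x})$. Therefore for $i \notin I_1(\hat{x})$, $g_i(\hat{x}) < 0$ and hence
by using the continuity of $g_i$ (for the specific $i \notin I_1(\hat{x})$), we may move a small distance
in any direction from $\hat{x}$ without violating that constraint. Writing $D_1(\hat{x})$ and $D_2(\hat{x})$ for the sets of
directions that are feasible with respect to the active and the inactive constraints
respectively, this gives $D_2(\hat{x}) = \mathbb{R}^n$. But
$D(\hat{x}) = D_1(\hat{x}) \cap D_2(\hat{x}) = D_1(\hat{x}) \cap \mathbb{R}^n = D_1(\hat{x})$. Therefore inactive constraints do not
contribute to the set $D(\hat{x})$ and hence only active constraints at $\hat{x}$ need to be considered.

Definition 8.2.6 (Linearizing Cone). Let $\hat{x} \in S$. Then the set $\mathcal{D}(\hat{x})$ given by
$\mathcal{D}(\hat{x}) = \{d \in \mathbb{R}^n : d^T \nabla g_i(\hat{x}) \le 0,\ i \in I_1(\hat{x})\}$ is called the linearizing cone of $S$ at $\hat{x}$.

The set $\mathcal{D}(\hat{x})$ is appropriately named the linearizing cone because it is generated
by linearizing the active constraints at $\hat{x}$, and $d \in \mathcal{D}(\hat{x})$, $\alpha > 0$, implies that $\alpha d \in \mathcal{D}(\hat{x})$.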
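
Membership in the linearizing cone only requires the gradients of the active constraints. The sketch below assumes analytic gradients are supplied as functions; the helper names are hypothetical, and the sample data uses the constraints $g_1 = -x_1$, $g_2 = -x_2$, $g_3 = -(1 - x_1)^3 + x_2$ at the point $(1, 0)^T$, where the cubic exponent in $g_3$ is an assumption made for this illustration.

```python
def active_set(constraints, x, tol=1e-12):
    """Indices i with g_i(x) = 0 (up to a small tolerance)."""
    return [i for i, g in enumerate(constraints) if abs(g(x)) <= tol]

def in_linearizing_cone(gradients, active, x, d):
    """True if d^T grad g_i(x) <= 0 for every active index i."""
    return all(sum(gk * dk for gk, dk in zip(gradients[i](x), d)) <= 0
               for i in active)

constraints = [
    lambda x: -x[0],
    lambda x: -x[1],
    lambda x: -(1 - x[0]) ** 3 + x[1],
]
gradients = [
    lambda x: [-1, 0],
    lambda x: [0, -1],
    lambda x: [3 * (1 - x[0]) ** 2, 1],
]
x_hat = [1, 0]
active = active_set(constraints, x_hat)   # second and third constraints
```

Here the cone reduces to $\{d : d_2 = 0\}$, so $(2, 0)$ belongs to it although moving from $(1, 0)$ in that direction leaves the feasible set.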


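Theorem 8.2.1 Let $\hat{x} \in S$. Then $D(\hat{x}) \subseteq \mathcal{D}(\hat{x})$.

Proof. Let $d \in D(\hat{x})$ and $g_i(\hat{x}) = 0$ for $i \in I_1(\hat{x})$. If possible, let $d^T \nabla g_i(\hat{x}) > 0$ for some
$i \in I_1(\hat{x})$. Then this will mean that $\exists\ \delta > 0$ such that $g_i(\hat{x} + \theta d) > g_i(\hat{x})$, $0 < \theta < \delta$. But
$g_i(\hat{x}) = 0$ and hence

$$g_i(\hat{x} + \theta d) > 0, \quad 0 < \theta < \delta,$$

which implies that $(\hat{x} + \theta d) \notin S$.
Hence $d \notin D(\hat{x})$. Therefore for all $d \in D(\hat{x})$,

$$d^T \nabla g_i(\hat{x}) \le 0, \quad \forall\ i \in I_1(\hat{x}).$$

Thus

$$d \in D(\hat{x}) \Rightarrow d^T \nabla g_i(\hat{x}) \le 0, \quad i \in I_1(\hat{x}).$$

Hence $D(\hat{x}) \subseteq \mathcal{D}(\hat{x})$.

Corollary 8.2.3 Let $\hat{x} \in S$. Then $\bar{D}(\hat{x}) \subseteq \mathcal{D}(\hat{x})$.

Proof. The set $\mathcal{D}(\hat{x})$ is a closed set and, by Theorem 8.2.1, $D(\hat{x}) \subseteq \mathcal{D}(\hat{x})$. Taking closures
gives $\bar{D}(\hat{x}) \subseteq \mathcal{D}(\hat{x})$.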
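In general $\bar{D}(\hat{x})$ may not be equal to $\mathcal{D}(\hat{x})$, as the following examples show.

Example 8.2.1 Let the constraints be

$$g_1(x) = x_1^3 - x_2 \le 0$$
$$g_2(x) = x_1^3 + x_2 \le 0,$$

and $\hat{x} = (0, 0)^T \in S$ be the given feasible point. Show that $\bar{D}(\hat{x}) \ne \mathcal{D}(\hat{x})$.

Solution First we note that both constraints are active at $(0, 0)^T$. Now

$$\nabla g_1(\hat{x}) = \begin{pmatrix} 3x_1^2 \\ -1 \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

and

$$\nabla g_2(\hat{x}) = \begin{pmatrix} 3x_1^2 \\ 1 \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Therefore

$$\begin{aligned}
\mathcal{D}(\hat{x}) &= \{d \in \mathbb{R}^2 : d^T \nabla g_i(\hat{x}) \le 0,\ i = 1, 2\} \\
&= \{(d_1, d_2) : -d_2 \le 0,\ d_2 \le 0\} \\
&= \{(d_1, d_2) : d_2 = 0\} \\
&= \{(d_1, 0) : d_1 \in \mathbb{R}\}.
\end{aligned}$$

Fig. 8.2.

Now we proceed to obtain $D(\hat{x})$. As $\hat{x} + \theta d = (\theta d_1, \theta d_2)^T$, we have by the definition
of $D(\hat{x})$,

$$\theta^3 d_1^3 - \theta d_2 \le 0, \quad \theta^3 d_1^3 + \theta d_2 \le 0 \quad \text{and} \quad 0 < \theta < \delta,$$

which gives $d_1 \le 0$ and $d_2 = 0$, as $\theta > 0$. Thus

$$D(\hat{x}) = \{(d_1, d_2) : d_1 \le 0,\ d_2 = 0\},$$

which we can also visualize from Fig. 8.2. Therefore $D(\hat{x}) \subset \mathcal{D}(\hat{x})$. But $(1, 0)^T \in \mathcal{D}(\hat{x})$,
which is not in $D(\hat{x})$, and hence $D(\hat{x}) \ne \mathcal{D}(\hat{x})$. So a direction $d$ of $\mathcal{D}(\hat{x})$ may point in an
infeasible direction. Here we may note that $\bar{D}(\hat{x}) \ne \mathcal{D}(\hat{x})$, as $\bar{D}(\hat{x}) = D(\hat{x})$ and $D(\hat{x}) \ne \mathcal{D}(\hat{x})$.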


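Example 8.2.2 Let the constraints be

$$g_1(x) = -x_1 \le 0$$
$$g_2(x) = -x_2 \le 0$$
$$g_3(x) = -(1 - x_1)^3 + x_2 \le 0,$$

and $\hat{x} = (1, 0)^T \in S$ be the given feasible point. Show that $\bar{D}(\hat{x}) \ne \mathcal{D}(\hat{x})$.

Solution As $g_1(\hat{x}) \ne 0$ and $g_2(\hat{x}) = 0 = g_3(\hat{x})$, constraints $g_2$ and $g_3$ are active at $\hat{x}$. Now

$$\begin{aligned}
\mathcal{D}(\hat{x}) &= \{d \in \mathbb{R}^2 : d^T \nabla g_2(\hat{x}) \le 0,\ d^T \nabla g_3(\hat{x}) \le 0\} \\
&= \{(d_1, d_2) : -d_2 \le 0,\ d_2 \le 0\} \\
&= \{(d_1, d_2) : d_2 = 0\}.
\end{aligned}$$

Also $\hat{x} + \theta d = (1 + \theta d_1, \theta d_2)^T$ and therefore $d \in D(\hat{x})$ requires, for $0 < \theta < \delta$,

$$-\theta d_2 \le 0 \quad \text{and} \quad \theta^3 d_1^3 + \theta d_2 \le 0.$$

The first condition gives $d_2 \ge 0$, and then the second gives $d_1 \le 0$. Further,
dividing the second condition by $\theta$ and letting $\theta \to 0^+$ gives $d_2 = 0$. Therefore

$$D(\hat{x}) = \{(d_1, d_2) : d_1 \le 0,\ d_2 = 0\},$$

which is also equal to $\bar{D}(\hat{x})$. Obviously, the point $(2, 0)^T \in \mathcal{D}(\hat{x})$ but it does not
belong to $\bar{D}(\hat{x})$. Hence $\bar{D}(\hat{x}) \ne \mathcal{D}(\hat{x})$.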


Remark 8.2.1 The definition of the linearizing cone $\mathcal{D}(\hat{x})$ depends explicitly on the
constraint functions $g_i$ $(i \in I)$. But the same feasible set may have more than one
representation and hence it is possible that $\bar{D}(\hat{x}) \ne \mathcal{D}(\hat{x})$ for one representation of the given
feasible region but $\bar{D}(\hat{x}) = \mathcal{D}(\hat{x})$ for another representation of the same feasible region.
Examples 8.2.3 and 8.2.4 illustrate this point.


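Example 8.2.3 The two representations (i) $x_1 = 0$, $x_2 \in \mathbb{R}$ and (ii) $x_1^2 = 0$, $x_2 \in \mathbb{R}$ give
the same feasible region of $\mathbb{R}^2$. Let $\hat{x} = (0, 0)^T$. Identify the representation for which
$\bar{D}(\hat{x}) = \mathcal{D}(\hat{x})$.

Solution The feasible region is the $x_2$ axis and $\hat{x}$ is the origin. Writing the first
representation as $x_1 \le 0$, $-x_1 \le 0$, $x_2 \in \mathbb{R}$, we have

$$D(\hat{x}) = \{(d_1, d_2) : 0 + \theta d_1 \le 0,\ -\theta d_1 \le 0,\ 0 < \theta < \delta\} = \{(d_1, d_2) : d_1 = 0\} = \bar{D}(\hat{x})$$

and

$$\mathcal{D}(\hat{x}) = \Big\{(d_1, d_2) : (d_1, d_2)\begin{pmatrix} 1 \\ 0 \end{pmatrix} \le 0 \text{ and } (d_1, d_2)\begin{pmatrix} -1 \\ 0 \end{pmatrix} \le 0\Big\} = \{(d_1, d_2) : d_1 = 0\}.$$

So $\bar{D}(\hat{x}) = \mathcal{D}(\hat{x})$.
But if we take the second representation for the same feasible region $S$, i.e. $x_1^2 = 0$,
written as $x_1^2 \le 0$, $-x_1^2 \le 0$, $x_2 \in \mathbb{R}$, then

$$D(\hat{x}) = \{(d_1, d_2) : d_1 = 0\} = \bar{D}(\hat{x})$$

and

$$\mathcal{D}(\hat{x}) = \Big\{(d_1, d_2) : (d_1, d_2)\begin{pmatrix} 2x_1 \\ 0 \end{pmatrix}_{(0,0)} \le 0 \text{ and } (d_1, d_2)\begin{pmatrix} -2x_1 \\ 0 \end{pmatrix}_{(0,0)} \le 0\Big\} = \{(d_1, d_2) : d_1 \cdot 0 + d_2 \cdot 0 \le 0\} = \mathbb{R}^2.$$

Thus $\bar{D}(0, 0) \ne \mathcal{D}(0, 0)$, although the feasible region $S$ remains the same, namely the
$x_2$-axis.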

E e aD ad (ti) (1-41 — x2) 20,x%1 20, 
22 = te Verify that both sets of constraints give the 
Solution It is simple to verify that the feasible region given by both sets of constraints is the same solid triangle with vertices $(0,0)$, $(1,0)$ and $(0,1)$. Now for the representation given by the first set of constraints

$$\mathcal{D}(1/2, 1/2) = \{(d_1, d_2) : d_1 + d_2 \le 0\},$$

and

$$D(1/2, 1/2) = \{d : \hat{x} + \theta d \in S,\ 0 < \theta < \delta\}$$
$$= \{(d_1, d_2) : (1/2 + \theta d_1) + (1/2 + \theta d_2) \le 1,\ -(1/2 + \theta d_1) \le 0,\ -(1/2 + \theta d_2) \le 0,\ 0 < \theta < \delta\}$$
$$= \{(d_1, d_2) : \theta(d_1 + d_2) \le 0,\ 0 < \theta < \delta\}$$
$$= \{(d_1, d_2) : d_1 + d_2 \le 0\} = \bar{D}(1/2, 1/2).$$

Therefore, $\bar{D}(1/2, 1/2) = \mathcal{D}(1/2, 1/2)$.
However, for the second representation, i.e. for the second set of constraints, $\bar{D}(1/2, 1/2) \neq \mathcal{D}(1/2, 1/2)$.


8.3 Basic Constraint Qualification 


We now define the basic constraint qualification for the nonlinear programming problem (8.1).

Definition 8.3.1 (Basic Constraint Qualification). Given the nonlinear programming problem (8.1), we say that the constraints satisfy the basic constraint qualification at a feasible point $\hat{x}$ if $\bar{D}(\hat{x}) = \mathcal{D}(\hat{x})$.

Lemma 8.3.1 Let (i) $\bar{x}$ be a local min point of the given nonlinear programming problem and (ii) the basic constraint qualification hold at $\bar{x}$, i.e. $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$. Then

$$d^T \nabla f(\bar{x}) \ge 0, \quad \forall\, d \in \mathcal{D}(\bar{x}).$$

The proof follows from Corollary 8.2.1 simply by replacing $\bar{D}(\bar{x})$ by $\mathcal{D}(\bar{x})$.
We now introduce two more sets $Z_1(\hat{x})$ and $Z_2(\hat{x})$, for any feasible point $\hat{x}$, as follows:

$$Z_1(\hat{x}) = \{d \in R^n : d^T \nabla f(\hat{x}) < 0\},$$
$$Z_2(\hat{x}) = \{d \in R^n : d^T \nabla g_i(\hat{x}) \le 0,\ i \in I(\hat{x}),\ d^T \nabla f(\hat{x}) < 0\}.$$

Lemma 8.3.2 Let $\bar{x}$ be a local min point of the given nonlinear programming problem at which the basic constraint qualification holds, i.e. $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$. Then $Z_2(\bar{x}) = \emptyset$.

Remark 8.3.1 Here it may be remarked that if for a feasible point $\bar{x}$, $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$ and $Z_2(\bar{x}) = \emptyset$, then it does not necessarily imply that $\bar{x}$ is a local min point of problem (8.1). Example 8.3.1 given below illustrates this point.


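Example 8.3.1 Consider the problem
Min $-x_2$
subject to

$$x_1^2 + x_2^2 \le 4$$
$$-x_1^2 + x_2 \le 0.$$

Let $\bar{x} = (0,0)^T$. Show that $Z_2(\bar{x}) = \emptyset$, $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$, but $\bar{x}$ is not a local min point of the given problem.

Solution For the point $\bar{x} = (0,0)^T$, we have

$$\nabla f(\bar{x}) = (0, -1)^T, \quad \nabla g_1(\bar{x}) = (0, 0)^T, \quad \nabla g_2(\bar{x}) = (0, 1)^T,$$

and $g_2$ is active at $\bar{x} = (0,0)^T$. Therefore

$$D(0,0) = \{(d_1, d_2) : -\theta^2 d_1^2 + \theta d_2 \le 0,\ \theta^2(d_1^2 + d_2^2) \le 4,\ 0 < \theta < \delta\}$$
$$= \{(d_1, d_2) : d_2 \le \theta d_1^2,\ 0 < \theta < \delta\} = \bar{D}(0,0),$$

and

$$\mathcal{D}(0,0) = \{(d_1, d_2) : d^T \nabla g_2(\bar{x}) \le 0\} = \{(d_1, d_2) : d_2 \le 0\}.$$

Therefore $\mathcal{D}(0,0) \subset D(0,0) = \bar{D}(0,0)$. As for feasible $x$, $\bar{D}(x) \subset \mathcal{D}(x)$ is always true, we have $\bar{D}(0,0) = \mathcal{D}(0,0)$.
Further

$$Z_2(0,0) = \{(d_1, d_2) : d^T \nabla g_2(\bar{x}) \le 0,\ d^T \nabla f(\bar{x}) < 0\}$$
$$= \{(d_1, d_2) : d_2 \le 0,\ -d_2 < 0\}$$
$$= \{(d_1, d_2) : d_2 \le 0,\ d_2 > 0\} = \emptyset.$$

However, $\bar{x}$ is not a local min point: for every small $t \neq 0$ the point $(t, t^2)^T$ is feasible and gives the objective value $-t^2 < 0 = f(\bar{x})$.

Fig. 8.4.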


8.4 Lagrangian and Lagrange Multipliers 


Definition 8.4.1 (Lagrangian or Lagrange Function). Given the nonlinear programming problem (8.1), the function $L : R^n \times R^m \to R$ given by $L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)$ is called its Lagrangian or Lagrange function. Further, the vector $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)^T$ is called the vector of Lagrange multipliers.

The following theorem gives the necessary and sufficient conditions for the existence of Lagrange multipliers at a feasible point $\hat{x}$.

Theorem 8.4.1 (Existence of Lagrange Multipliers).
Let $S$ be the feasible region of the given nonlinear programming problem (8.1). Let $\hat{x} \in S$. Then $Z_2(\hat{x}) = \emptyset$ if and only if there exists a vector of Lagrange multipliers $\lambda \in R^m$ such that
(i) $\nabla f(\hat{x}) + \sum_{i \in I} \lambda_i \nabla g_i(\hat{x}) = 0$
(ii) $g_i(\hat{x}) \le 0$, $i \in I$
(iii) $\lambda_i g_i(\hat{x}) = 0$, $i \in I$
(iv) $\lambda_i \ge 0$, $i \in I$.

Remark 8.4.1 The above system (i)-(iv) is called the KKT system corresponding to a given feasible point $\hat{x}$. If, for a given point $\hat{x} \in S$, there exist $\lambda_i$ $(i \in I)$ such that $(\hat{x}, \lambda)$ satisfies the KKT system, then we say that at the point $\hat{x}$ the KKT conditions hold, or $\hat{x}$ is a KKT point. Thus Theorem 8.4.1 gives the necessary and sufficient condition for $\hat{x}$ to be a KKT point.


As mentioned earlier, we shall make use of the following lemma, called the Farkas' Lemma.

Lemma (Farkas' Lemma). Let $a^{(0)}, a^{(1)}, \ldots, a^{(r)}$ be $(r+1)$ vectors in $R^n$. Then there exist scalars $\beta_k \ge 0$ $(k = 1, 2, \ldots, r)$ such that

$$a^{(0)} = \sum_{k=1}^{r} \beta_k a^{(k)}$$

if and only if $y^T a^{(0)} \ge 0$ for all $y$ satisfying $y^T a^{(k)} \ge 0$ $(k = 1, 2, \ldots, r)$.

If $A = [a^{(1)}, a^{(2)}, \ldots, a^{(r)}]$ and $T = \{y \in R^n : y^T A \ge 0\}$, then the Farkas' Lemma states that

$$(y \in T \Rightarrow y^T a^{(0)} \ge 0) \iff \exists\, \beta \in R^r,\ \beta \ge 0, \text{ such that } a^{(0)} = A\beta.$$

Proof. Let $a^{(0)} = \sum_{k=1}^{r} \beta_k a^{(k)}$ with $\beta_k \ge 0$. Then for $y \in T$,

$$y^T a^{(0)} = \sum_{k=1}^{r} \beta_k\, y^T a^{(k)} \ge 0.$$

Conversely, let $y^T a^{(0)} \ge 0$ for all $y \in T$. We wish to show that $a^{(0)}$ is a non-negative linear combination of the vectors $a^{(k)}$ $(k = 1, 2, \ldots, r)$. For this, consider the LPP

Min $y^T a^{(0)}$
subject to
$$y^T a^{(k)} \ge 0, \quad (k = 1, 2, \ldots, r) \qquad (8.3)$$

and note that the linear programming problem (8.3) has an optimal solution $y = 0$, as $y^T a^{(0)} \ge 0$ for all $y \in T$. Therefore, from the duality theorem of linear programming, the dual of the above problem (8.3) is feasible and has a finite optimal solution. But the dual of (8.3) is

Max $0^T \beta$
subject to
$$a^{(0)} = \sum_{k=1}^{r} \beta_k a^{(k)}$$
$$\beta_k \ge 0, \quad (k = 1, 2, \ldots, r). \qquad (8.4)$$

Feasibility of (8.4) gives scalars $\beta_k \ge 0$ with $a^{(0)} = \sum_{k=1}^{r} \beta_k a^{(k)}$, which proves the lemma.
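Proof of Theorem 8.4.1.
We first note that $\mathcal{D}(\hat{x}) \neq \emptyset$ because $0 \in \mathcal{D}(\hat{x})$. Therefore $Z_2(\hat{x}) = \emptyset$ if and only if for all $d \in \mathcal{D}(\hat{x})$ we have

$$d^T \nabla f(\hat{x}) \ge 0,$$

i.e.

$$d^T \nabla g_i(\hat{x}) \le 0,\ i \in I(\hat{x}) \ \Rightarrow\ d^T \nabla f(\hat{x}) \ge 0.$$

If we now take $a^{(0)} = \nabla f(\hat{x})$ and $a^{(i)} = -\nabla g_i(\hat{x})$, $i \in I(\hat{x})$, then the Farkas' Lemma becomes applicable and therefore there exist multipliers $\lambda_i \ge 0$ $(i \in I(\hat{x}))$ such that

$$\nabla f(\hat{x}) = \sum_{i \in I(\hat{x})} \lambda_i \left(-\nabla g_i(\hat{x})\right),$$

i.e.

$$\nabla f(\hat{x}) + \sum_{i \in I(\hat{x})} \lambda_i \nabla g_i(\hat{x}) = 0. \qquad (8.5)$$

Now taking $\lambda_i = 0$ for $i \notin I(\hat{x})$, we have from (8.5)

$$\nabla f(\hat{x}) + \sum_{i \in I} \lambda_i \nabla g_i(\hat{x}) = 0.$$

Also

$$g_i(\hat{x}) \le 0,\ (i \in I), \qquad \lambda_i \ge 0,\ (i \in I).$$

Further, the relation $\lambda_i g_i(\hat{x}) = 0$ $(i \in I)$ follows because for $i \in I(\hat{x})$, $g_i(\hat{x}) = 0$, and for $i \notin I(\hat{x})$, as per choice, $\lambda_i = 0$. This proves the main theorem.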


8.5 Karush-Kuhn-Tucker Necessary /Sufficient Optimality Conditions 


We now state the most basic result in nonlinear programming, namely the Karush- 
Kuhn-Tucker (KKT) Theorem. As most of the mathematical concepts needed to prove 
this theorem have already been presented in earlier sections, the proof will follow simply 


by recalling some of the earlier results. 


Theorem 8.5.1 (Necessary Part of the KKT Theorem).
Let $\bar{x}$ be a local min point of the given nonlinear programming problem (8.1) at which the basic constraint qualification holds, i.e. $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$. Then there exist multipliers (called KKT multipliers) $\bar{\lambda}_i$ $(i \in I)$ such that the following KKT conditions hold:

(i) $\nabla f(\bar{x}) + \sum_{i \in I} \bar{\lambda}_i \nabla g_i(\bar{x}) = 0$
(ii) $g_i(\bar{x}) \le 0$, $i \in I$
(iii) $\bar{\lambda}_i g_i(\bar{x}) = 0$, $i \in I$
(iv) $\bar{\lambda}_i \ge 0$, $i \in I$.

Proof. From the given hypothesis we infer from Lemma 8.3.2 that $Z_2(\bar{x}) = \emptyset$. The proof follows directly by employing Theorem 8.4.1.

Theorem 8.5.2 (Sufficient Part of the KKT Theorem).
Let $(\bar{x}, \bar{\lambda})$, $\bar{\lambda} = (\bar{\lambda}_1, \ldots, \bar{\lambda}_m)$, satisfy the KKT conditions (i)-(iv) of Theorem 8.5.1, and let $f$ and $g_i$ $(i \in I)$ be differentiable convex functions. Then $\bar{x}$ is a global min point of the given nonlinear programming problem (8.1).

Proof. We have to prove that $\bar{x}$ is a global min point of problem (8.1), i.e. $f(\bar{x}) \le f(x)$ for all feasible $x$. For this let $x$ be any feasible point. Then by the convexity of $f$ we have

$$f(x) - f(\bar{x}) \ge (x - \bar{x})^T \nabla f(\bar{x}). \qquad (8.6)$$

Now substituting for $\nabla f(\bar{x})$ from the KKT condition (i), the above inequality gives

$$f(x) - f(\bar{x}) \ge -\sum_{i=1}^{m} \bar{\lambda}_i \left((x - \bar{x})^T \nabla g_i(\bar{x})\right). \qquad (8.7)$$

But $g_i$ $(i \in I)$ is a convex function and hence

$$g_i(x) - g_i(\bar{x}) \ge (x - \bar{x})^T \nabla g_i(\bar{x}),$$

i.e.

$$-(x - \bar{x})^T \nabla g_i(\bar{x}) \ge -g_i(x) + g_i(\bar{x}). \qquad (8.8)$$

As $\bar{\lambda}_i \ge 0$ $(i \in I)$, (8.7) and (8.8) give

$$f(x) - f(\bar{x}) \ge -\sum_{i=1}^{m} \bar{\lambda}_i g_i(x) + \sum_{i=1}^{m} \bar{\lambda}_i g_i(\bar{x}). \qquad (8.9)$$

Also from the KKT condition (iii) we have $\bar{\lambda}_i g_i(\bar{x}) = 0$ $(i \in I)$. Further, the feasibility of $x$ gives $g_i(x) \le 0$ $(i \in I)$. Therefore from (8.9) we get

$$f(x) - f(\bar{x}) \ge -\sum_{i=1}^{m} \bar{\lambda}_i g_i(x) \ge 0,$$

which proves the result.
Example 8.5.1 Consider the nonlinear programming problem
Min $-x_2$
subject to

$$x_1^2 + x_2^2 \le 4$$
$$-x_1^2 + x_2 \le 0.$$

Verify that the KKT conditions are satisfied at $(0,0)$ but it is not a global (not even a local) min point.

Solution At $\hat{x} = (0,0)^T \in S$ (feasible region), $Z_2(\hat{x}) = \emptyset$, so the KKT conditions are certainly satisfied. This can also be verified by taking $\hat{x} = (0,0)$, $\lambda_1 = 0$, $\lambda_2 = 1$ in the KKT system for the given problem.

But $(0,0)$ is not a local min point, as can be seen from Fig 8.5: the objective is to minimize $-x_2$, i.e. to maximize $x_2$, so the optimal points are $P$ and $Q$ as shown in Fig 8.5.

Fig. 8.5.
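The KKT system (i)-(iv) can be checked mechanically once the gradients are known. The following sketch assumes the problem is Min $-x_2$ subject to $x_1^2 + x_2^2 \le 4$ and $-x_1^2 + x_2 \le 0$; the helper name `kkt_report` is hypothetical and only evaluates the four conditions at a given point and multiplier vector.

```python
def f(x):
    return -x[1]


def g(x):
    """Constraint values g1, g2 (feasible when both are <= 0)."""
    return (x[0] * x[0] + x[1] * x[1] - 4.0, -x[0] * x[0] + x[1])


def grad_f(x):
    return (0.0, -1.0)


def grad_g(x):
    return ((2.0 * x[0], 2.0 * x[1]), (-2.0 * x[0], 1.0))


def kkt_report(x, lam, tol=1e-9):
    """Evaluate KKT conditions (i)-(iv) at the point x with multipliers lam."""
    gf, gg, gv = grad_f(x), grad_g(x), g(x)
    stationarity = [
        gf[j] + sum(l * grad[j] for l, grad in zip(lam, gg)) for j in range(2)
    ]
    return {
        "stationarity": all(abs(s) <= tol for s in stationarity),
        "primal_feasible": all(v <= tol for v in gv),
        "complementary": all(abs(l * v) <= tol for l, v in zip(lam, gv)),
        "dual_feasible": all(l >= -tol for l in lam),
    }


x_hat = (0.0, 0.0)
report = kkt_report(x_hat, lam=(0.0, 1.0))
print(report)

# A nearby feasible point on the parabola x2 = x1^2 has a smaller objective,
# so the KKT point (0, 0) is not a local min point.
t = 0.1
nearby = (t, t * t)
print(max(g(nearby)) <= 1e-12, f(nearby) < f(x_hat))
```

All four conditions hold at $(0,0)$ with $\lambda = (0, 1)$, while the nearby feasible point has a strictly smaller objective value. This is possible because $g_2$ is not convex, so Theorem 8.5.2 does not apply.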













Remark 8.5.2 In proving Theorem 8.5.1, the important thing is to guarantee that at a local min point $\bar{x}$, $Z_2(\bar{x}) = \emptyset$. In our presentation here, we have used the basic constraint qualification $\bar{D}(\bar{x}) = \mathcal{D}(\bar{x})$ for this purpose. But obviously there could be other 'suitable' constraint qualifications giving $Z_2(\bar{x}) = \emptyset$. Though the literature contains numerous such constraint qualifications, we discuss only the linear independence constraint qualification below, because it can be verified more easily than other existing constraint qualifications. Nevertheless, the topic of constraint qualifications is important, and we may refer to Mangasarian [109] for further details in this regard.


Definition (Linear Independence Constraint Qualification). Let $\bar{x}$ be a feasible point of the nonlinear programming problem (8.1) and $I(\bar{x})$ be the index set of the constraints active at $\bar{x}$. Then the linear independence constraint qualification is said to hold at $\bar{x}$ if the vectors $\nabla g_i(\bar{x})$, $i \in I(\bar{x})$, are linearly independent.
Remark 8.5.3 One certainly needs some constraint qualification so as to guarantee that at a local min point $\bar{x}$, $Z_2(\bar{x}) = \emptyset$. Otherwise, the KKT conditions may not hold at an optimal point, as is illustrated in Example 8.5.2 given below.

Example 8.5.2 Consider the nonlinear programming problem
Min $-x_1$
subject to

$$-(1 - x_1)^3 + x_2 \le 0$$
$$-x_1 \le 0$$
$$-x_2 \le 0,$$

and verify that $(1,0)$ is optimal. Does the basic constraint qualification hold at $(1,0)$? Are the KKT conditions satisfied at $(1,0)$?

For the above problem $\bar{x} = (1,0)$ is an optimal solution (see Fig 8.6) but the KKT conditions do not hold at $(1,0)$. We may also check that $\bar{D}(\bar{x}) \neq \mathcal{D}(\bar{x})$. Had this basic constraint qualification or some other 'suitable' constraint qualification been holding at $\bar{x}$, then optimality of $\bar{x}$ would have given $Z_2(\bar{x}) = \emptyset$ and hence the KKT conditions would certainly have been satisfied.

However, here we have $Z_2(\bar{x}) \neq \emptyset$. This can be verified because

$$\mathcal{D}(\bar{x}) = \{(d_1, d_2) : d_2 = 0\} = \{(d_1, 0) : d_1 \in R\},$$
$$Z_2(\bar{x}) = \{d \in \mathcal{D}(\bar{x}) : d^T \nabla f(\bar{x}) < 0\} = \{(d_1, 0) : -d_1 < 0\} = \{(d_1, 0) : d_1 > 0\} \neq \emptyset.$$

4a 
8.6 Duality in Nonlinear Programming

In this section we present duality in nonlinear programming via the Lagrangian. We consider the following convex programming problem (CPP)

Min $f(x)$
subject to
$$g_i(x) \le 0, \quad i \in I = \{1, 2, \ldots, m\} \qquad (8.10)$$

and associate the Lagrange function $L$ given by

$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) = f(x) + \lambda^T g(x),$$

where $x \in R^n$, $\lambda \in R^m$ and $g(x) = (g_1(x), \ldots, g_m(x))^T$.
To motivate the construction of the dual for the convex programming problem (8.10), we introduce the following two functions

$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda)$$

and

$$L_*(\lambda) = \min_{x \in R^n} L(x, \lambda),$$

and call them the primal function and the dual function respectively. Now the function $L^*(x)$ can be written explicitly as

$$L^*(x) = \max_{\lambda \ge 0} L(x, \lambda) = \max_{\lambda \ge 0} \left(f(x) + \lambda^T g(x)\right) = \begin{cases} f(x), & \text{if } g_i(x) \le 0\ (i = 1, 2, \ldots, m), \\ +\infty, & \text{otherwise.} \end{cases}$$

Therefore the problem

$$\min_{x \in R^n} L^*(x)$$

is the same as the given problem (8.10), i.e. (8.10) can be written as

$$\min_{x \in R^n} \max_{\lambda \ge 0} L(x, \lambda). \qquad (8.11)$$

It is natural now to construct the other problem

$$\max_{\lambda \ge 0} \min_{x \in R^n} L(x, \lambda) \qquad (8.12)$$

and see if there is any meaningful connection between these two problems. In the following we shall show that under the assumption of convexity the two problems 'Min Max $L(x, \lambda)$' and 'Max Min $L(x, \lambda)$' are related in exactly the same way as the standard linear programming problem and its dual are related, and could be taken as the standard primal-dual pair of convex programming problems.
The above discussion suggests that the dual (8.12) for problem (8.11), or equivalently the given problem (8.10), could be defined as

$$\max_{\lambda \ge 0} L_*(\lambda),$$

i.e.

$$\max_{\lambda \ge 0} \min_{x \in R^n} L(x, \lambda),$$

i.e.

$$\max_{\lambda \ge 0} \min_{x \in R^n} \left(f(x) + \lambda^T g(x)\right),$$

i.e.

Max $f(u) + \lambda^T g(u)$
subject to
$$f(u) + \lambda^T g(u) = \min_{x \in R^n} \left(f(x) + \lambda^T g(x)\right)$$
$$\lambda \ge 0. \qquad (8.13)$$
Remark 8.6.1 If we wish to be more precise mathematically, then in the above discussion 'max' and 'min' should be replaced by 'sup' and 'inf' respectively.

We now make the assumption that $f$ and $g_i$ $(i \in I)$ are differentiable convex functions. This implies that $L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)$ is also convex in $x$ for all fixed $\lambda \ge 0$. Therefore $\nabla_x L(x, \lambda)\big|_{x = u} = 0$ if and only if $L(u, \lambda)$ is the minimum value of $L(x, \lambda)$, i.e.

$$L(u, \lambda) = \min_{x \in R^n} \left[f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)\right].$$

Hence the dual (8.13) can be rewritten as

Max $f(u) + \sum_{i=1}^{m} \lambda_i g_i(u)$
subject to
$$\nabla f(u) + \sum_{i=1}^{m} \lambda_i \nabla g_i(u) = 0$$
$$\lambda \ge 0. \qquad (8.14)$$

This dual (8.14) is called the Wolfe Dual of the given convex programming problem (8.10). For the sake of convenience, in the following we shall refer to the convex programming problem (8.10) as (CP) and its dual (8.14) as (CD).


Theorem 8.6.1 (Weak Duality Theorem). Let $x$ be feasible for (CP) and $(u, \lambda)$ be feasible for (CD). Then

$$f(x) \ge f(u) + \sum_{i=1}^{m} \lambda_i g_i(u).$$

Proof. By the convexity of $f$ we have

$$f(x) - f(u) \ge (x - u)^T \nabla f(u)$$
$$= (x - u)^T \left[-\sum_{i=1}^{m} \lambda_i \nabla g_i(u)\right] \qquad (8.15)$$
$$= -\sum_{i=1}^{m} \lambda_i \left((x - u)^T \nabla g_i(u)\right) \qquad (8.16)$$
$$\ge \sum_{i=1}^{m} \lambda_i g_i(u) - \sum_{i=1}^{m} \lambda_i g_i(x) \ge \sum_{i=1}^{m} \lambda_i g_i(u). \qquad (8.17)$$

Therefore $f(x) \ge f(u) + \sum_{i=1}^{m} \lambda_i g_i(u)$.

Here the step from (8.16) to (8.17) uses the convexity of $g_i$ $(i \in I)$, and the other relations follow because of the feasibility of $x$ and $(u, \lambda)$ for (CP) and (CD) respectively.
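Weak duality is easy to observe on a small instance. The sketch below uses an illustrative one-variable problem that is not from the text: Min $x^2$ subject to $1 - x \le 0$. Its Wolfe dual is Max $u^2 + \lambda(1 - u)$ subject to $2u - \lambda = 0$, $\lambda \ge 0$, so every dual feasible pair has the form $(u, 2u)$ with $u \ge 0$.

```python
def f(x):
    return x * x


def g(x):
    return 1.0 - x  # feasible when g(x) <= 0, i.e. x >= 1


def dual_objective(u, lam):
    """Wolfe dual objective f(u) + lam * g(u)."""
    return f(u) + lam * g(u)


# Sample primal feasible points x >= 1 and dual feasible pairs (u, lam = 2u).
primal_points = [1.0 + 0.25 * i for i in range(9)]
dual_points = [(0.25 * j, 2.0 * 0.25 * j) for j in range(9)]

# Weak duality: every primal value bounds every dual value from above.
gap_ok = all(
    f(x) >= dual_objective(u, lam) - 1e-12
    for x in primal_points
    for (u, lam) in dual_points
)

best_primal = min(f(x) for x in primal_points)
best_dual = max(dual_objective(u, lam) for (u, lam) in dual_points)
print(gap_ok, best_primal, best_dual)
```

On this instance the smallest sampled primal value and the largest sampled dual value coincide at $x = u = 1$, $\lambda = 2$, which is the situation described in Corollary 8.6.1.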


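Corollary 8.6.1 Let $\bar{x}$ be feasible for (CP) and $(\bar{u}, \bar{\lambda})$ be feasible for (CD). Further, let $f(\bar{x}) = f(\bar{u}) + \sum_{i=1}^{m} \bar{\lambda}_i g_i(\bar{u})$. Then $\bar{x}$ is optimal for (CP) and $(\bar{u}, \bar{\lambda})$ is optimal for (CD).

Proof. Let $x$ be feasible to (CP). Then by Theorem 8.6.1, we have $f(x) \ge f(\bar{u}) + \sum_{i=1}^{m} \bar{\lambda}_i g_i(\bar{u}) = f(\bar{x})$, which shows that $\bar{x}$ is optimal to (CP). Similarly we can prove that $(\bar{u}, \bar{\lambda})$ is optimal to (CD).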





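Theorem 8.6.2 (Strong Duality Theorem). Let $\bar{x}$ be optimal for (CP) and let the basic constraint qualification hold at $\bar{x}$. Then there exists $\bar{\lambda}$ such that $(\bar{x}, \bar{\lambda})$ is optimal for (CD) and the optimal values of the two objective functions coincide.

Proof. By Theorem 8.5.1 there exists $\bar{\lambda}$ such that the KKT conditions (i)-(iv) hold at $\bar{x}$. Equations (i) and (iv) give that $(\bar{x}, \bar{\lambda})$ is feasible to (CD). Also (iii) gives $f(\bar{x}) = f(\bar{x}) + \sum_{i=1}^{m} \bar{\lambda}_i g_i(\bar{x})$, and so $(\bar{x}, \bar{\lambda})$ is optimal to (CD) by Corollary 8.6.1.

We now state several converse duality theorems without proofs.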


Theorem 8.6.3 (Strict Converse Duality Theorem). Let (i) (CP) have an optimal solution $\bar{x}$, (ii) the basic constraint qualification hold at $\bar{x}$, (iii) $f$ be strictly convex and $g_i$ $(i \in I)$ be convex functions, and (iv) $(\bar{u}, \bar{\lambda})$ be optimal to (CD). Then $\bar{x} = \bar{u}$.

Theorem 8.6.4 (Hanson's Converse Duality Theorem). Let $f$, $g_i$ $(i \in I)$ be twice differentiable convex functions. Let $(\bar{x}, \bar{\lambda})$ be optimal to (CD), at which the basic constraint qualification holds. Let the Hessian $\nabla_x^2 L(\bar{x}, \bar{\lambda})$ be non-singular. Then $\bar{x}$ is optimal to (CP) and the two optimal values are equal.

Remark 8.6.2 In stating various duality theorems we have made use of the basic constraint qualification defined earlier (Definition 8.3.1). However, in view of Remark 8.5.2, rather than taking the basic constraint qualification, we can as well take some other 'suitable' constraint qualification, e.g. the Kuhn-Tucker constraint qualification (Mangasarian [109]).
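8.7 Certain Special Cases of Wolfe Dual

(i) Linear Programming Duality
Consider the linear programming problem

Max $c^T x$
subject to
$$Ax \le b$$
$$x \ge 0. \qquad (8.18)$$

We now consider this LPP in 'min' form and write its Wolfe dual. So the given LPP is

$-$Min $-c^T x$
subject to
$$Ax - b \le 0$$
$$-x \le 0.$$

Its Lagrangian is

$$L(x, \lambda, \mu) = -c^T x + \lambda^T(Ax - b) - \mu^T x,$$

which gives $\nabla_x L(x, \lambda, \mu) = -c + A^T \lambda - \mu$.
Hence its Wolfe dual is

Max $L(x, \lambda, \mu)$
subject to
$$\nabla_x L(x, \lambda, \mu) = 0$$
$$\lambda \ge 0$$
$$\mu \ge 0,$$

i.e.

Max $-c^T x + \lambda^T(Ax - b) - \mu^T x$
subject to
$$-c + A^T \lambda - \mu = 0$$
$$\lambda \ge 0$$
$$\mu \ge 0,$$

i.e.

Max $-b^T \lambda$
subject to
$$A^T \lambda = c + \mu \ge c$$
$$\lambda \ge 0.$$

Therefore the dual of the original problem is

$-$Max $-b^T \lambda$
subject to
$$A^T \lambda \ge c$$
$$\lambda \ge 0,$$

i.e.

Min $b^T \lambda$
subject to
$$A^T \lambda \ge c$$
$$\lambda \ge 0. \qquad (8.19)$$

Thus the usual dual of an LPP is obtained in this case, and (8.18) and (8.19) constitute the usual primal-dual pair.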





(ii) Quadratic Programming Duality
Let the problem be

Max $c^T x - \frac{1}{2} x^T D x$
subject to
$$Ax \le b$$
$$x \ge 0, \qquad (8.20)$$

i.e.

$-$Min $-c^T x + \frac{1}{2} x^T D x$
subject to
$$Ax - b \le 0$$
$$-x \le 0. \qquad (8.21)$$

Here $D$ is assumed to be positive semi-definite so that the objective function of (8.20) is a concave function (equivalently the objective function of (8.21) is a convex function). We now take the Lagrangian

$$L(x, \lambda, \mu) = -c^T x + \frac{1}{2} x^T D x + \lambda^T(Ax - b) - \mu^T x,$$

which gives

$$\nabla_x L(x, \lambda, \mu) = -c + Dx + A^T \lambda - \mu,$$

and then its dual becomes

$-$Max $L(x, \lambda, \mu)$
subject to
$$\nabla_x L = 0$$
$$\lambda \ge 0$$
$$\mu \ge 0.$$

Writing $L(x, \lambda, \mu) = x^T(-c + Dx + A^T \lambda - \mu) - \frac{1}{2} x^T D x - b^T \lambda$ and using $\nabla_x L = 0$, this is

$-$Max $-b^T \lambda - \frac{1}{2} x^T D x$
subject to
$$-c + A^T \lambda + Dx - \mu = 0$$
$$\lambda \ge 0$$
$$\mu \ge 0,$$

i.e.

$-$Max $-b^T \lambda - \frac{1}{2} x^T D x$
subject to
$$A^T \lambda + Dx \ge c$$
$$\lambda \ge 0,$$

i.e.

Min $b^T \lambda + \frac{1}{2} x^T D x$
subject to
$$A^T \lambda + Dx \ge c$$
$$\lambda \ge 0. \qquad (8.22)$$

Problems (8.20) and (8.22) constitute the primal-dual pair of quadratic programming problems. In case $D = 0$, problems (8.20) and (8.22) become the linear programming primal-dual pair (8.18) and (8.19).
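The quadratic primal-dual pair can be sanity-checked on a scalar instance. The sketch below uses illustrative data that is not from the text ($c = 2$, $D = 1$, $A = 1$, $b = 1$) and a brute-force grid search rather than a proper solver. The primal is taken as Max $cx - \frac{1}{2}Dx^2$ subject to $Ax \le b$, $x \ge 0$, and the dual as Min $b\lambda + \frac{1}{2}Dx^2$ subject to $A\lambda + Dx \ge c$, $\lambda \ge 0$.

```python
# Illustrative scalar data (assumed for this sketch only).
c, D, A, b = 2.0, 1.0, 1.0, 1.0

# Grid of candidate values 0.00, 0.01, ..., 3.00 for both x and lambda.
grid = [i / 100.0 for i in range(301)]

# Primal: Max c*x - 0.5*D*x^2 subject to A*x <= b, x >= 0.
primal_best = max(
    c * x - 0.5 * D * x * x for x in grid if A * x <= b + 1e-12
)

# Dual: Min b*lam + 0.5*D*x^2 subject to A*lam + D*x >= c, lam >= 0.
dual_best = min(
    b * lam + 0.5 * D * x * x
    for x in grid
    for lam in grid
    if A * lam + D * x >= c - 1e-12
)

print(primal_best, dual_best)
```

Both searches return the value $1.5$, attained at $x = 1$ for the primal and at $(x, \lambda) = (1, 1)$ for the dual, so the two optimal values agree on this instance.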


8.8 Summary and Additional Notes


- 


i 7 E 
TIVES a6 forme al M; hem. 
d: pe 


uses H ON 7 Ea diaro 6 st of the Karush-Kuhn- Tucker opti- 
tee a no acus the oe angian dual of the given 


7 a 


Si e K a $ 4 O pt ob ma lity sar AS gs 
The Kuhn-Tucker optimality conditions were derived by Kuhn and Tucker in 1951. However, Karush had originally derived these conditions in 1939 using calculus of variations. His work could not get much attention as it was never published. But later, when Karush's work came to notice, the Kuhn-Tucker optimality conditions were renamed as the Karush-Kuhn-Tucker (KKT) conditions in recognition of the same.


8.9 Exercises
Bees GE theorem to find the value of B for which (xı = 1,X2 = 2) is optimal to 
the problem 


Mar Z=2x, + Bx2 
subject to 
noes eS 


) xı- X2 <2. 


= Verify your result graphically. 
(a f. $ Hi 7 . | 








nsider LPP 


Maz U = 4x, + 3X2 
subject to 

Xi + X2 <8 
2a + x2 SA 
i x a io 0. | 
8.3 Consider the optimization problem





Max G a Xa 
subject to 
| Ixil + [val < 2 
f xi -3 20, 


| 1. Solve the above problem graphically? 

2. Do KKT conditions hold at (0,0)? 

3. Can we conclude that (0,0) is optimal? 

4. Is the given problem a convex programming problem? 


8.4 Consider the NLP

Min $Z = (x_1 - 4)^2 + (x_2 - 4)^2$
subject to

$$x_1 + x_2 \le 4$$
$$x_1 - x_2 \le 2$$
$$x_1, x_2 \ge 0.$$

Let $\hat{x} = (x_1 = 2, x_2 = 0)$ be a feasible point. Determine $D(\hat{x})$, $\bar{D}(\hat{x})$, $\mathcal{D}(\hat{x})$, $Z_1(\hat{x})$ and $Z_2(\hat{x})$ analytically and show them graphically as well. Do there exist KKT multipliers satisfying the KKT conditions at $(2,0)$?











8.5 Consider the NLP

Max $Z = \ln(1 + x_1) + x_2$
subject to
$$2x_1 + x_2 \le 3$$
$$x_1, x_2 \ge 0.$$

1. Write the KKT conditions.
2. Given that the optimal solution of the above NLP lies on the line $x_2 = 3$, use the KKT conditions to find its optimal solution.


8.6 Consider the NLP 


Min Z=X+xX 





‘a 
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write the KKT conditions of the 
I. Check if the KKT conditions hold 


above NLP. el: 

| at (x* sa i ta UIN 
= 1,X2 = 1) optimal for the . ee 1). 

f € above NLP? Verify your ansu 

e KKT theorem. 


i. 
$. Is for your answer i ; 
x give reasons y Tr aan the light of th er graphically and 


ider the constraints set x, + x 


2s 4, iS 
= 4). Obtain D(X), and D(X). Doe X2 > 0 and x2 


+%5 < 16. Let £ = (x, = 
s the basi eae np) vel ee 
€ basic constraint qualification holds at the 


3.8 Consider the following NLP 


Min Z= (1-4) +m ~6% 
subject to 

Xo 2 xf 
XxX, <4. 


1, Solve the above NLP graphically. 

9. Write the KKT conditions. 

9 Do the KKT conditions hold at the point (2,4)?. 

J. Can we use the KKT conditions to declare that (2,4) is optimal? Give reasons for 


your answer. 
8.9 Write the Wolfe dual (D) of the following NLP (P)

Min $Z = (x_1 - 4)^2 + (x_2 - 4)^2$
subject to
$$x_1 + x_2 \le 4$$
$$x_1 - x_2 \le 2$$
$$x_1, x_2 \ge 0.$$

Solve (P) and (D) and hence verify the strong duality theorem.


8.10 Write the Wolfe dual (D) of following NLP (P) 


Maz Z = 2x, + 402 
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2 2 0, As 2 0, and L(x1,X2, A, A2,A3) = (40, +35 -2i a 
licitly the following problems + 


T “at. 


8.11 Let x1,x2 ER, A; 2 9, A | 
Ay (x1 + X2 = 1) -— À2%1 ~ A3X2. Describe exp 

i d Max | MinL(x1,X2,A1,A2, io) It is clai 
maa) (daa le Ras My i (Ay,A2,Aa) \(%1/%2) med that 


(a2) (AL A2A3 i AAT, 
both of these problems have optimal solutions and equal optimal values. Justify Your 


answer and obtain the common optimal value. 


| 


8.12 Consider the following NLP (P) 


Maz LX) 
subject to 
2 2 
Neh all 


(xı — il) <S Xo) 


1. Write the KKT conditions and check if these are satisfied at (1,0). 

2. Solve the above NLP graphically and hence obtain its optimal solution (X1, X2) 

3. Do there exist KKT multipliers at the point (x1,X2)? Give reasons for pris nen 
4. Is the above a conver programming problem? Give reasons for your answer? | 


ð. Write the Wolfe dual (D) of (P). Can your } : 
equal optimal values. guarane Ragne ) and ( D) will have 
9
Unconstrained Optimization Problems

9.1 Introduction

An optimization problem in which the decision vector $x$ is allowed to take any value in $R^n$ is called an unconstrained optimization problem, written in short as (UMP). This chapter is devoted to the study of certain standard algorithms for unconstrained minimization of functions of one variable as well as functions of several variables. Although most real life optimization problems are constrained optimization problems, the study of unconstrained optimization problems is important, mainly because several efficient techniques for solving constrained optimization problems use our knowledge of solving unconstrained optimization problems.


9.2 Basic Scheme and Certain Desirable Properties 


We consider the unconstrained minimization problem

$$\min_{x \in R^n} f(x) \qquad (9.1)$$

and aim to develop algorithms for solving the same. Ideally we would like to get a global min point of (9.1), i.e. a point $\bar{x} \in R^n$ such that $f(\bar{x}) \le f(x)$ for all $x \in R^n$. But unfortunately this seems to be rather difficult for a general function $f$. Even finding a local min point, i.e. a point $\bar{x}$ such that there exists a neighbourhood $N_\delta(\bar{x})$ with $f(\bar{x}) \le f(x)$ for all $x \in N_\delta(\bar{x})$, is also not always possible.
In view of the above, we most often decide to identify a set $\Omega \subset R^n$, called the solution set, such that under certain additional conditions on the function $f$ (e.g. convexity) any point of $\Omega$ is a global min point of problem (9.1).
So far we have not put any smoothness condition on the function $f$. Let us now assume that the function $f$ is continuously differentiable. Sometimes we may also have to assume that $f$ is twice continuously differentiable, but this condition we shall mention whenever it is needed.
Definition 9.2.1 (Solution Set). Let $f : R^n \to R$ be continuously differentiable. Then the set

$$\Omega = \{x \in R^n : \nabla f(x) = 0\}$$

is called the solution set of (9.1).

Remark 9.2.1 Let $f : R^n \to R$ be a differentiable convex function and $\bar{x}$ be a point of the solution set. Then $\bar{x}$ is a global min point of problem (9.1).

We now describe a common basic scheme of the form

$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$$

for solving the unconstrained minimization problem (9.1), where $x^{(k)}$ is the current solution, $d^{(k)}$ is the direction of movement from $x^{(k)}$, and $\alpha_k > 0$ (called the step size) is the distance up to which we move in the direction $d^{(k)}$ from the current point $x^{(k)}$. The obvious question now is how to find the direction of movement $d^{(k)}$ and how to determine the step size $\alpha_k$. Various algorithms for solving problem (9.1) have been devised to determine the direction $d^{(k)}$ and the step size $\alpha_k$ so that the sequence of iterates $\{x^{(k)}\}$ converges to a point $\bar{x}$ of the solution set $\Omega$ in an 'efficient' manner.
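To make the basic scheme concrete, the sketch below instantiates it with one particular choice that the text has not yet introduced: the direction $d^{(k)} = -\nabla f(x^{(k)})$ and a fixed step size $\alpha_k = 0.1$. The test function $f(x) = (x_1 - 1)^2 + 2(x_2 + 0.5)^2$ and the name `basic_scheme` are assumptions made only for this illustration.

```python
import math


def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


def grad_f(x):
    return (2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5))


def basic_scheme(x0, alpha=0.1, tol=1e-8, max_iter=1000):
    """Iterate x(k+1) = x(k) + alpha * d(k) with d(k) = -grad f(x(k)).

    Stops when the gradient norm is below tol, i.e. when the iterate is
    (numerically) in the solution set {x : grad f(x) = 0}.
    """
    x = tuple(x0)
    iterates = [x]
    for _ in range(max_iter):
        grad = grad_f(x)
        if math.hypot(grad[0], grad[1]) <= tol:
            break
        d = (-grad[0], -grad[1])  # direction of movement
        x = (x[0] + alpha * d[0], x[1] + alpha * d[1])
        iterates.append(x)
    return iterates


xs = basic_scheme((4.0, 3.0))
print(len(xs) - 1, xs[-1])
```

For this convex quadratic the iterates approach $(1, -0.5)$, the only point of the solution set, and the objective value decreases at every step. A fixed step size is used purely for simplicity; choosing $\alpha_k$ by a line search is the more usual practice.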

Before we present any specific algorithm for solving problem (9.1), we list certain 
desirable properties which we ideally expect the given algorithm to possess. 


Definition 9.2.2 (Descent Property). An algorithm for solving the unconstrained minimization problem (9.1) is said to have the descent property if the objective function value decreases as we go through the sequence $\{x^{(k)}\}$, i.e.

$$f(x^{(k+1)}) < f(x^{(k)}) \quad \text{for all } k.$$

In other words, for the algorithm to possess the descent property, the objective function should decrease as we proceed.

Ss q $ -R 





li i j ict , A co Vey Ti) A. i . . IA i 
fig AA sath Sala ae or a positive definite 
elore if the aloariet. 
eee © Heorithm behaves well on such a function, 


Ll (TUF ne { 


ae kaar 1 
0 | lis |i Y Y = i 
L NECI T l 

4 PEO s < 

| a b SATCU 
AS Weli 2 

ed Pe AS dhd | 
= 


4. 
oe 
oe 
The property of 'global convergence' guarantees that we can start the algorithm from any point $x^{(0)} \in R^n$. Thus no matter from which point we start, we are guaranteed to generate a sequence $\{x^{(k)}\}$ that converges to a point of the solution set. Here it must be noted that two different starting points will, in general, generate two different sequences of iterates, but both converging to a point of the solution set.

as è d 
q and et X Ri o : th at pie j k. The quantity ||x® — x|| is called the error of the 
. erate x. uppose that there exists p and 0 <a < œ such that 


x+) — X|| 
koo Oae (0 <a < œ) 


n p is called the order of convergence of the sequence {x}. Thus \\x**) — x|] = 
ae - xP asymptotically. 


p= 1, the sequence {cK} is said to have linear convergence rate and for p = 2, it is 
said to have quadratic convergence rate. In case p = 1 but a = 0, then the sequence {x} 
iş said to have super linear convergence rate. 

The order of convergence or convergence rate is an important concept as it tells 
ns how the ‘tail’ of the sequence {x} behaves. Larger values of p will imply faster 
convergence. Most of the algorithms generally do very well for the first few iterations 
but become very slow near the optimal solution. But if p is large then there will be 
significant improvement in the objective function value even near the actual solution. 
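The order $p$ can be estimated from three consecutive errors, since $e_{k+1} \approx a\, e_k^p$ implies $p \approx \log(e_{k+1}/e_k) / \log(e_k/e_{k-1})$. The sketch below applies this to a sequence chosen only for illustration: Newton's iteration for minimizing $f(x) = x - \ln x$ (minimizer $\bar{x} = 1$), which reduces to $x^{(k+1)} = 2x^{(k)} - (x^{(k)})^2$. Newton's method is an assumption of this example and is not part of the discussion above.

```python
import math


def newton_iterates(x0, steps):
    """Newton's method for f(x) = x - ln(x): x_new = x - f'(x)/f''(x) = 2x - x^2."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(2.0 * x - x * x)
    return xs


def estimate_orders(errors):
    """Estimate p from consecutive error triples (e_{k-1}, e_k, e_{k+1})."""
    return [
        math.log(errors[k + 1] / errors[k]) / math.log(errors[k] / errors[k - 1])
        for k in range(1, len(errors) - 1)
    ]


x_bar = 1.0
xs = newton_iterates(0.5, steps=5)
errors = [abs(x - x_bar) for x in xs]
orders = estimate_orders(errors)
print(errors)
print(orders)
```

For this sequence the error is squared at each step ($e_{k+1} = e_k^2$), so every estimate is close to $p = 2$, i.e. the convergence rate is quadratic.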

We now discuss line search methods or one dimensional search methods which, apart from being useful in themselves, are used to determine the step size $\alpha_k$ in the basic scheme

$$x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}.$$

9.3 Line Search Methods for Unimodal Functions

These methods are used to locate the min (or max) point of a unimodal function of one variable over an interval, i.e. we are given a unimodal function $f : [a, b] \to R$ and we wish to locate the minimizing point $x_{min}$ such that

$$f(x_{min}) = \min_{a \le x \le b} f(x). \qquad (9.2)$$

Definition 9.3.1 (Unimodal Function). The function $f : [a, b] \to R$ is said to be a unimodal min function if it has only one mode, i.e. only one minimum point, in $[a, b]$.

Similarly we define a unimodal max function. Note that a unimodal function may not be differentiable; in fact it may not even be continuous, as depicted in Fig 9.1.

Fig. 9.1.

Basic Strategy 


Let us choose two distinct points (say) $x_1$ and $x_2$ in $[a, b]$, i.e. $a < x_1 < x_2 < b$. Let $x_{min}$ denote the point giving the minimum value of $f(x)$ over $[a, b]$. Since $f$ is a unimodal min function, it is clear that

(i) $f(x_1) < f(x_2) \Rightarrow x_{min} \in [a, x_2]$

(ii) $f(x_1) > f(x_2) \Rightarrow x_{min} \in [x_1, b]$

(iii) $f(x_1) = f(x_2) \Rightarrow x_{min} \in [x_1, x_2]$.

Case (iii) is not that likely in practice. Thus, by evaluating the function at two distinct points, the initial search length $(b - a)$ (i.e. the interval $[a, b]$) has been reduced to $(x_2 - a)$ (i.e. the interval $[a, x_2]$) or $(b - x_1)$ (i.e. the interval $[x_1, b]$). The interval $[a, b]$ is called the initial interval of uncertainty. The basic question in line search is how to go about choosing the points $x_1$ and $x_2$.

CAN C ete rmina 
Ol determine 










vS DINALECST DOS 
“vs > s Te Ker al Ie 
Not se Ji 1 i F 7 

> LNOCLA Tniark «as. 7. 
wULe inat We do 
T 


j Lilt TRLIQT | a 
á * AAA) J iW yA 

aes Pa 

} {a ` ACI 

SPECI 


| E> PE ven 
S11 - | 
IU U L 





G 
Sam aa 









i } a i i 
AN Ar ARo x A ow J s Us ' ï i ts 
od ue upon how the N trial Po% 
E a ne T 
el i, eve A i Py i 








~ i PN A i re. a 
{ d | 1 co ) J P M 4 f j“ s F, < t n ni 
sE LUNCTION J7 au | 
- A UALL PS pean } at 
d A an? 


~ 





ri 










(i) the Golden Section Rule
(ii) the Fibonacci Search Method.

In describing the aforesaid methods we shall make use of the following notations:

$x_{L,k}$ = lower limit of the search interval at the $k$th iteration (so $x_{L,1} = a$)
$x_{U,k}$ = upper limit of the search interval at the $k$th iteration (so $x_{U,1} = b$)
$x_{p,k}$ = 1st trial point at the $k$th iteration
$x_{q,k}$ = 2nd trial point at the $k$th iteration
$E_{p,k} = f(x_{p,k})$, value of the function at the 1st trial point
$E_{q,k} = f(x_{q,k})$, value of the function at the 2nd trial point
$I_k = x_{U,k} - x_{L,k}$ = length of the search interval at the $k$th iteration (so $I_1 = (b - a)$)
$I_k^L$ = length of the left part of the search interval $I_k$
$I_k^R$ = length of the right part of the search interval $I_k$.

Fig. 9.2.







m 
(ii) at the $k$th iteration, the other trial point is obtained from the previous iteration, so that
$$I_k = I_{k+1} + I_{k+2}$$
(see Fig 9.2)
(iii) $\dfrac{I_k}{I_{k+1}} = \dfrac{I_{k+1}}{I_{k+2}} = c$ (constant) for all $k$.

This is called the golden section criterion. In general, a point $w$ is said to divide the interval $[u, v]$ according to the golden section criterion if we have

$$\frac{\text{length of the whole interval}}{\text{length of the bigger interval}} = \frac{\text{length of the bigger interval}}{\text{length of the smaller interval}},$$

i.e.

$$\frac{v - u}{w - u} = \frac{w - u}{v - w}.$$

Thus we wish to choose trial points $x_{p,k}$ and $x_{q,k}$ according to the above listed criteria (i), (ii) and (iii). From (ii) we have

$$I_k = I_{k+1} + I_{k+2},$$

i.e.

$$\frac{I_k}{I_{k+2}} = \frac{I_{k+1}}{I_{k+2}} + 1,$$

i.e.

$$\frac{I_k}{I_{k+1}} \cdot \frac{I_{k+1}}{I_{k+2}} = \frac{I_{k+1}}{I_{k+2}} + 1,$$

i.e. (by (iii))

$$c^2 = c + 1. \quad \text{Therefore } c = \frac{1 \pm \sqrt{5}}{2}.$$

Since $c$ cannot be negative, we get $c = \dfrac{1 + \sqrt{5}}{2} = 1.618$, which gives $\dfrac{1}{c} = 0.618$. (The number $c = 1.618$ is called the golden section ratio and it has a long history.)

Suppose the initial interval is $[0, 1]$. Then due to the above we have

$$x_{p,1} = 0.382, \qquad x_{q,1} = 0.618.$$

For a general interval $[a, b]$, we can then have

$$x_{p,1} = b - 0.618\,(b - a)$$
$$x_{q,1} = a + 0.618\,(b - a).$$

Thus in general,

$$x_{p,k} = x_{U,k} - 0.618\, I_k$$
$$x_{q,k} = x_{L,k} + 0.618\, I_k.$$
The stepwise description of the method is now as follows.

Step 1 Input data $x_{L,1}$, $x_{U,1}$, $\epsilon$, $f$. (Here $\epsilon > 0$ is the tolerance to be prescribed by the user.)

Step 2 Compute the first two trial points $x_{p,1}$ and $x_{q,1}$, where

    $x_{p,1} = x_{U,1} - 0.618\,(x_{U,1} - x_{L,1})$
    $x_{q,1} = x_{L,1} + 0.618\,(x_{U,1} - x_{L,1})$.

Set $k = 1$.

Step 3 Evaluate the function $f$ at the two trial points $x_{p,k}$ and $x_{q,k}$. Let

    $E_{p,k} = f(x_{p,k})$
    $E_{q,k} = f(x_{q,k})$.

Step 4 Test the interval which contains the minimum, i.e. if $E_{p,k} < E_{q,k}$, go to Step 5, otherwise go to Step 6.

Step 5 Use the following relations to update the data

    $x_{L,k+1} = x_{L,k}$
    $x_{U,k+1} = x_{q,k}$
    $x_{q,k+1} = x_{p,k}$
    $x_{p,k+1} = x_{U,k+1} - 0.618\, I_{k+1}$
    $E_{p,k+1} = f(x_{p,k+1})$
    $E_{q,k+1} = f(x_{q,k+1}) = E_{p,k}$

and go to Step 7.

Step 6 Use the following relations to update the data

    $x_{L,k+1} = x_{p,k}$
    $x_{U,k+1} = x_{U,k}$
    $x_{p,k+1} = x_{q,k}$
    $x_{q,k+1} = x_{L,k+1} + 0.618\, I_{k+1}$
    $E_{p,k+1} = f(x_{p,k+1}) = E_{q,k}$
    $E_{q,k+1} = f(x_{q,k+1})$.

Step 7 Test for the end of optimization, i.e. if $I_{k+1} < \epsilon$, go to Step 8, otherwise set $k = k + 1$ and go to Step 4.

Step 8 Output $x_{L,n}$, $x_{U,n}$ (with $n = k + 1$). Then $x_{\min} \in (x_{L,n}, x_{U,n})$ and $E_{\min} < \mathrm{Min}(E_{p,n-1}, E_{q,n-1})$.
Remark 9.3.1 It can be noted that for the golden section rule $I_n = I_1 (0.618)^{n-1}$. Thus knowing $I_1$ (the length of the initial search interval) and $I_n$ (the length of the final search interval as desired by the user) we can find out the number of iterations as well as the number of points at which the function is to be evaluated. Note that if we are going up to the 7th iteration (i.e. getting $x_{L,7}$ and $x_{U,7}$) then we shall be having $n = 7$ functional evaluations, namely, at $x_{p,1}$, $x_{q,1}$, and one each at the 2nd, 3rd, 4th, 5th and 6th iteration. To stop at the 7th iteration we shall be computing $x_{p,6}$ and $x_{q,6}$, and $E_{\min} < \mathrm{Min}(E_{p,6}, E_{q,6})$.

Thus $N$ functional evaluations will mean stopping at the $N$th iteration and computing points up to $x_{p,N-1}$ and $x_{q,N-1}$. Also $E_{\min} < \mathrm{Min}(E_{p,N-1}, E_{q,N-1})$ and there are only $(N-1)$ interval reductions.


Example 9.3.1 Find min $x^2$ over $[-5, 15]$ by the golden section rule. Take $\epsilon = 1.5$.

Solution Following the steps of the golden section rule we get

    k    x_{L,k}   x_{U,k}   x_{p,k}   x_{q,k}   E_{p,k}   E_{q,k}   L/R
    1    -5.0      15.00      2.64      7.36      6.96      54.1      L
    2    -5.0       7.36     -0.27      2.64      0.077      6.96     L
    3    -5.0       2.64     -2.08     -0.27      4.33       0.077    R
    4    -2.08      2.64     -0.27      0.84      0.077      0.71     L
    5    -2.08      0.84     -0.96     -0.27      0.92       0.077    R
    6    -0.96      0.84     -0.27      0.15      0.077      0.023    R

Thus $x_{\min} \in [-0.27, 0.84]$ and $E_{\min} < \mathrm{Min}(0.077, 0.023)$.
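The golden section rule described above can be sketched in a few lines of Python. This is only an illustrative sketch and not code from the book: the function name `golden_section` and its arguments are assumptions made here, and the exact value $1/c = (\sqrt{5}-1)/2$ is used in place of the rounded 0.618, so the numbers differ from the table in the second decimal place.

```python
import math

def golden_section(f, x_lo, x_hi, eps):
    """Golden section rule for a unimodal f on [x_lo, x_hi] (hypothetical helper).

    Returns the final interval and the number of function evaluations used.
    """
    if x_hi - x_lo < eps:
        return x_lo, x_hi, 0
    r = (math.sqrt(5.0) - 1.0) / 2.0           # 1/c = 0.618...
    x_p = x_hi - r * (x_hi - x_lo)             # first trial point
    x_q = x_lo + r * (x_hi - x_lo)             # second trial point
    e_p, e_q = f(x_p), f(x_q)
    evaluations = 2
    while True:
        keep_left = e_p < e_q
        if keep_left:
            x_hi = x_q                         # minimum lies in the left part
        else:
            x_lo = x_p                         # minimum lies in the right part
        if x_hi - x_lo < eps:
            return x_lo, x_hi, evaluations
        if keep_left:
            x_q, e_q = x_p, e_p                # old x_p is reused as new x_q
            x_p = x_hi - r * (x_hi - x_lo)
            e_p = f(x_p)
        else:
            x_p, e_p = x_q, e_q                # old x_q is reused as new x_p
            x_q = x_lo + r * (x_hi - x_lo)
            e_q = f(x_q)
        evaluations += 1

# Data of the example: min x^2 over [-5, 15] with eps = 1.5
lo, hi, n_eval = golden_section(lambda x: x * x, -5.0, 15.0, 1.5)
print(round(lo, 2), round(hi, 2), n_eval)
```

Only one new function value is computed per interval reduction, which is exactly what criterion (ii) is meant to achieve; the length test is made before the new trial point is evaluated so that no evaluation is wasted on the final interval.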


The value $\epsilon = 1.5$ has been taken here only for illustration. Normally $\epsilon$ is taken much smaller, e.g. $\epsilon = 0.001$.
The Fibonacci Search Method

We continue our discussion with regard to problem (9.2) and present another approach which is based on the Fibonacci sequence.

Definition 9.3.2 (Fibonacci Sequence). Let $F_0 = 1$, $F_1 = 1$ and $F_i = F_{i-1} + F_{i-2}$ ($i \geq 2$). Then $\{F_n\}$ is called the sequence of Fibonacci numbers or the Fibonacci sequence. Thus the Fibonacci sequence is $\{1, 1, 2, 3, 5, 8, 13, 21, \ldots\}$ where $F_0 = 1$, $F_1 = 1$, $F_2 = 2$, $F_3 = 3$, $F_4 = 5$, $F_5 = 8$, $F_6 = 13$, $F_7 = 21$ and so on.

Now we describe the Fibonacci search scheme. Here the first two conditions are exactly the same as in the golden section rule, but the third condition changes. Thus we have

(i) $I_k^L = I_k^R$ for all $k$
(ii) $I_k = I_{k+1} + I_{k+2}$
(iii) $\dfrac{I_k}{I_{k+1}} = \dfrac{F_{n-k+1}}{F_{n-k}}$, where $n$ denotes the number of iterations to be performed.

We now give a justification of the condition (iii) given above. If we wish to stop at the $n$th iteration, then

    $I_{n+1} = I_n = 1\, I_n$
    $I_n = 1\, I_n$
    $I_{n-1} = I_n + I_{n+1} = 2\, I_n$
    $I_{n-2} = I_{n-1} + I_n = 3\, I_n$
    $I_{n-3} = 5\, I_n$
    $I_{n-4} = 8\, I_n$
    ...
    $I_1 = F_n\, I_n$.

Thus we can see at once the role of the Fibonacci numbers, because $I_1/I_n = F_n$, or in general,

    $\dfrac{I_k}{I_{k+1}} = \dfrac{F_{n-k+1}}{F_{n-k}}$.

Hence the only difference between the golden section rule and the Fibonacci search method is the requirement (iii). In the golden section rule $\dfrac{I_k}{I_{k+1}} = c = 1.618$ for all $k$ and all $n$. Here, in the Fibonacci search method this ratio depends on both the current iteration $k$ and also the total number of iterations $n$ to be performed.

Therefore instead of 0.618 (i.e. $\dfrac{1}{c}$) we shall be using the number $\dfrac{F_{n-k}}{F_{n-k+1}}$ for determining the trial points.


The stepwise description of the Fibonacci search method is as follows.

Step 1 Input data $x_{L,1}$, $x_{U,1}$, $n$, $f$. (Here $n$ is the number of iterations to be performed.)

Step 2 Compute the first two trial points $x_{p,1}$ and $x_{q,1}$, where

    $x_{p,1} = x_{U,1} - \dfrac{F_{n-1}}{F_n}\,(x_{U,1} - x_{L,1})$
    $x_{q,1} = x_{L,1} + \dfrac{F_{n-1}}{F_n}\,(x_{U,1} - x_{L,1})$.

Set $k = 1$.

Step 3 Evaluate

    $E_{p,k} = f(x_{p,k})$
    $E_{q,k} = f(x_{q,k})$

and test the interval which contains the minimum. If $E_{p,k} < E_{q,k}$, go to Step 4, otherwise go to Step 5.

Step 4 Use the following relations to update the data

    $x_{L,k+1} = x_{L,k}$
    $x_{U,k+1} = x_{q,k}$
    $x_{p,k+1} = x_{U,k+1} - \dfrac{F_{n-k-1}}{F_{n-k}}\, I_{k+1}$
    $x_{q,k+1} = x_{p,k}$
    $E_{p,k+1} = f(x_{p,k+1})$
    $E_{q,k+1} = f(x_{q,k+1}) = E_{p,k}$.

Step 5 Use the following relations to update the data

    $x_{L,k+1} = x_{p,k}$
    $x_{U,k+1} = x_{U,k}$
    $x_{p,k+1} = x_{q,k}$
    $x_{q,k+1} = x_{L,k+1} + \dfrac{F_{n-k-1}}{F_{n-k}}\, I_{k+1}$
    $E_{p,k+1} = f(x_{p,k+1}) = E_{q,k}$
    $E_{q,k+1} = f(x_{q,k+1})$.

Step 6 If $k < (n - 2)$, then set $k = k + 1$ and go to Step 3. If $k = (n - 2)$ then go to Step 7.

Step 7 As $F_0 = F_1 = 1$, it can be seen that at the $(n-1)$th iteration, the two trial points $x_{p,n-1}$ and $x_{q,n-1}$ will come out to be the same, i.e. $x_{p,n-1} = x_{q,n-1}$. To make them distinct we have to add / subtract a small positive number $\epsilon$ (say 0.01) to one of these. For this we look at the $(n-2)$th iteration. If there (i.e. at the $(n-2)$th iteration) we are searching the left then $\epsilon$ is subtracted from $x_{p,n-1}$, and if we are searching the right then it is added to $x_{q,n-1}$. We then evaluate $f$ at $x_{p,n-1}$ and $x_{q,n-1}$ and hence obtain $x_{L,n}$ and $x_{U,n}$. Then $x_{\min} \in (x_{L,n}, x_{U,n})$ and $E_{\min} < \mathrm{Min}(E_{p,n-1}, E_{q,n-1})$.

The value of $n$ can be computed from the relation $I_1/I_n = F_n$.
Example 9.3.2 Taking $n = 7$, find min $x^2$ over $[-5, 15]$ by the Fibonacci search method.

Solution Following the steps of the Fibonacci search method we get

    k   F_{n-k}/F_{n-k+1}   x_{L,k}   x_{U,k}   x_{p,k}   x_{q,k}       E_{p,k}   E_{q,k}   L/R
    1   13/21               -5.0      15.00      2.62      7.38          6.88      54.6      L
    2   8/13                -5.0       7.38     -0.24      2.62          0.058      6.88     L
    3   5/8                 -5.0       2.62     -2.14     -0.24          4.57       0.058    R
    4   3/5                 -2.14      2.62     -0.24      0.72          0.058      0.52     L
    5   2/3                 -2.14      0.72     -1.2      -0.24          1.44       0.058    R
    6   1/2                 -1.2       0.72     -0.24     -0.24+0.01     0.058      0.053    R
    7   1                   -0.24      0.72

From this table we observe that $x_{\min} \in [-0.24, 0.72]$ and $f_{\min} < 0.053$.

In the given example, $I_1 = 20$ and therefore if we take $\epsilon = 1.5$, then $I_n = 1.5$. This gives $\dfrac{I_1}{I_n} = \dfrac{20}{1.5} = 13.3$, which from the sequence of Fibonacci numbers determines $n = 7$ as $F_6 = 13$ and $F_7 = 21$. Also as explained in Step 7, the two trial points at the $(n-1)$th iteration, i.e. $x_{p,6}$ and $x_{q,6}$, both become equal to $-0.24$. Therefore we look at the $(n-2)$th iteration, i.e. the 5th iteration in our example. As there we are searching the right part, we add 0.01 to $x_{q,6}$ to get the new $x_{q,6}$ and then continue as explained.
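The Fibonacci search scheme can be sketched as follows. This is a hedged illustration rather than the book's own code: the names `fibonacci_numbers`, `fibonacci_search` and the parameter `delta` (the small number added or subtracted when the last two trial points coincide) are assumptions made here, and $n \geq 3$ is assumed.

```python
def fibonacci_numbers(n):
    """Return [F_0, ..., F_n] with F_0 = F_1 = 1."""
    fib = [1, 1]
    for _ in range(2, n + 1):
        fib.append(fib[-1] + fib[-2])
    return fib[: n + 1]

def fibonacci_search(f, x_lo, x_hi, n, delta=0.01):
    """Fibonacci search with n function evaluations (hypothetical helper, n >= 3)."""
    fib = fibonacci_numbers(n)
    ratio = fib[n - 1] / fib[n]
    x_p = x_hi - ratio * (x_hi - x_lo)
    x_q = x_lo + ratio * (x_hi - x_lo)
    e_p, e_q = f(x_p), f(x_q)
    for k in range(1, n - 1):                  # k = 1, ..., n-2
        ratio = fib[n - k - 1] / fib[n - k]    # ratio for iteration k+1
        last = (k == n - 2)                    # next trial points would coincide
        if e_p < e_q:                          # search the left part
            x_hi = x_q
            x_q, e_q = x_p, e_p
            x_p = x_hi - ratio * (x_hi - x_lo)
            if last:
                x_p -= delta
            e_p = f(x_p)
        else:                                  # search the right part
            x_lo = x_p
            x_p, e_p = x_q, e_q
            x_q = x_lo + ratio * (x_hi - x_lo)
            if last:
                x_q += delta
            e_q = f(x_q)
    # final interval reduction, using the (n-1)th pair of trial points
    if e_p < e_q:
        x_hi = x_q
    else:
        x_lo = x_p
    return x_lo, x_hi

# Data of the example: min x^2 over [-5, 15] with n = 7
fib_lo, fib_hi = fibonacci_search(lambda x: x * x, -5.0, 15.0, 7)
print(round(fib_lo, 2), round(fib_hi, 2))
```

With these data the final interval has length $I_1/F_7 = 20/21$; unrounded arithmetic gives an upper end of about 0.71, where the table, which works with two-decimal values, shows 0.72.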


Relation Between the Fibonacci Search Method and the Golden Section Rule 


For the constant $c$ appearing in the golden section rule we have $c = 1 + 1/c = 1.618$. Also it can be proved that $F_N \approx \dfrac{c^{N+1}}{\sqrt{5}}$ (for large $N$). We shall now like to compare these two methods for the same number of function evaluations.

Let $R_F$ and $R_{GS}$ denote the reduction factors for the Fibonacci search method and the golden section rule respectively. Then

    $R_F = \dfrac{I_N}{I_1} = \dfrac{1}{F_N}$ and $R_{GS} = \dfrac{I_N}{I_1} = \dfrac{1}{c^{N-1}}$.

Therefore

    $\dfrac{R_{GS}}{R_F} = \dfrac{F_N}{c^{N-1}} \approx \dfrac{c^{N+1}}{\sqrt{5}\, c^{N-1}} = \dfrac{c^2}{\sqrt{5}} = 1.17$.

Thus for the same number of function evaluations, the final search interval obtained by the golden section rule is about 17% longer than the one obtained by the Fibonacci search method.


9.4 The Steepest Descent Method 


The steepest descent method is one of the oldest gradient based methods for solving
an unconstrained optimization problem. This method is extremely simple to implement
and therefore has been used widely in various applications. The main drawback of the
steepest descent method is its slow convergence (as its order of convergence is one).
But this has motivated researchers to develop more advanced algorithms by modifying
the basic descent strategy of the steepest descent method so that these algorithms have
superior convergence properties. We shall discuss some of these algorithms in
the subsequent sections.
Let us consider the unconstrained optimization problem (UMP)

    $\underset{x \in R^n}{\mathrm{Min}}\ f(x)$        (9.3)

where $f$ has continuous first order partial derivatives on $R^n$. Then the basic scheme of the steepest descent method can be described as follows

    $x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$,

where $d^{(k)} = -\nabla f(x^{(k)}) / \|\nabla f(x^{(k)})\|$ is the direction of movement and the step size $\alpha_k \geq 0$ is chosen such that

    $h(\alpha_k) = \underset{\alpha \geq 0}{\mathrm{Min}}\ h(\alpha)$.

Here the function $h(\alpha)$ is given by $h(\alpha) = f(x^{(k)} + \alpha d^{(k)})$, and we stop when $\|\nabla f(x^{(k)})\| < \epsilon$.

Thus the steepest descent method could be described as follows.

Step 1 Choose $x^{(0)} \in R^n$ and the tolerance $\epsilon > 0$. Set $k = 0$.

Step 2 Evaluate $d^{(k)} = -\nabla f(x^{(k)}) / \|\nabla f(x^{(k)})\|$.

Step 3 Compute $x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$, where $\alpha_k \geq 0$ is chosen such that $h(\alpha_k) = \underset{\alpha \geq 0}{\mathrm{Min}}\ h(\alpha)$.

Steps 2 and 3 above are repeated till $\|\nabla f(x^{(k)})\| < \epsilon$. In that case $x^{(k)}$ becomes a point of the solution set and therefore, if $f$ is a convex function then it becomes an optimal solution of the unconstrained minimization problem (9.3).

A natural question here is to justify the particular choice of the direction $d^{(k)}$ at the current point $x^{(k)}$, i.e. why should we choose $d^{(k)} = -\nabla f(x^{(k)}) / \|\nabla f(x^{(k)})\|$? For this let us recall the definition of the directional derivative of $f$ at the point $x$ in the direction $d$ as

    $\left.\dfrac{\partial f}{\partial d}\right|_{x} = \lim_{\alpha \to 0^+} \dfrac{f(x + \alpha d) - f(x)}{\alpha}$

and note that for the case when $f$ has continuous first order partial derivatives, we have

    $\left.\dfrac{\partial f}{\partial d}\right|_{x} = d^T \nabla f(x)$.

Therefore for $d^{(k)} = -\dfrac{\nabla f(x^{(k)})}{\|\nabla f(x^{(k)})\|}$ we get

    $\left.\dfrac{\partial f}{\partial d^{(k)}}\right|_{x = x^{(k)}} = -\dfrac{\|\nabla f(x^{(k)})\|^2}{\|\nabla f(x^{(k)})\|} = -\|\nabla f(x^{(k)})\| < 0$.

As for $d^{(k)} = -\dfrac{\nabla f(x^{(k)})}{\|\nabla f(x^{(k)})\|}$ we have $\left.\dfrac{\partial f}{\partial d^{(k)}}\right|_{x = x^{(k)}} < 0$, it makes sense to move in the direction $d^{(k)}$ if we wish to minimize the function $f(x)$. In fact the directional derivative of $f$ at $x$ in the direction $d$ is least when $d = -\dfrac{\nabla f(x)}{\|\nabla f(x)\|}$ because for the given $x$ the optimization problem

    Min $d^T \nabla f(x)$
    subject to
    $\|d\| \leq 1$

has the optimal solution $d = -\dfrac{\nabla f(x)}{\|\nabla f(x)\|}$ with the optimal value $-\|\nabla f(x)\| < 0$.

Using any of the above arguments we get convinced that if we are to minimize (locally) a function $f$ from the current point $x^{(k)}$, then we must move in the direction of $-\nabla f(x^{(k)})$, i.e. take $d^{(k)}$ as the unit vector in that direction.
I 3 T 


PRN 

Wa nawotOre 
1 ¢herelolre 
s J a > 1 ~ = 





t = 
ig 






=í 





O E 7 he 
eels oosing t ne step SIZe Ok: a) hengt 0 
Anre JOL \4 a e x” + Xk d 1 Ok an 7 


manner all pc 1r U 
TE COlisaews SC * =? 
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ting at y and in the direction d™. Here it sh 
ne x) is a function of x, i.e. the function f isn 


‘action d), i.e. there must ey; 
, in the direction a’, 1.e. ust exist 
: rease for all time to come, 1 i eg (k) 
fy ree -Ai function will not decrease 1n the direction d frk common 
wd omnes that the easiest (not necessarily the best) Way $i Y Whe Oy 
ponams all values x) + ar d® , i.e. consider the function hlag) = f(x + ay ath 
to consider s minimum. Here we should note that h(az) is a function 


oose Qr for which (ax) i el 
` one ioe namely a, only and so this minimization can be done numerically z 


well. Therefore @ > 0 is chosen such that 


h(a) = Min h(a) 
&>0 


ie. all points on the ray origi 
a that as f is nonlinear, V f ( 


as described in the basic scheme of the steepest descent method. f 

We should now go back and try to verify which of the so called 'desirable properties'
the steepest descent method possesses. Without proving any of these we state the result
below for the steepest descent method (and this practice we shall follow for all
algorithms discussed in this chapter).


Result 9.4.1 The steepest descent algorithm 

(i) has descent property 

(ii) does not have quadratic termination property 
(iii) is globally convergent and 

(iv) has order of convergence p = 1. 


Therefore if we are using the steepest descent method then we can start from any
point $x^{(0)}$ and as we proceed, the objective function value will decrease, but the algorithm
may take a large number of iterations near the optimal solution. Also it may, in general, take
more than $n$ iterations to minimize a positive definite quadratic form of $n$ variables.

We now illustrate the working of the steepest descent method for the example given
below.


Example 9.4.1 Use the steepest descent method to minimize $f(x_1, x_2) = 3x_1^2 - 4x_1x_2 + 2x_2^2 + 4x_1 + 6$ over $(x_1, x_2) \in R^2$.

Solution Here $f$ is a convex function and so the steepest descent method is applicable. We have $\nabla f(x) = (6x_1 - 4x_2 + 4,\ -4x_1 + 4x_2)^T$. Starting from $x^{(1)} = (0, 0)^T$ and following the steps of the steepest descent method we get

    k    x^{(k)}               \nabla f(x^{(k)})    d^{(k)}       \alpha_k
    1    (0, 0)^T              (4, 0)^T             (-1, 0)^T     2/3
    2    (-2/3, 0)^T           (0, 8/3)^T           (0, -1)^T     2/3
    3    (-2/3, -2/3)^T        (8/3, 0)^T           (-1, 0)^T     4/9
    4    (-10/9, -2/3)^T       (0, 16/9)^T          (0, -1)^T     4/9
    5    (-10/9, -10/9)^T      (16/9, 0)^T          (-1, 0)^T     8/27


Remark 9.4.1 The function $f(x_1, x_2)$ of Example (9.4.1) is a positive definite quadratic
function in two variables but its optimal solution has not been obtained in at most two
iterations. This illustrates that the method of steepest descent does not possess the quadratic
termination property.
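The behaviour noted in this remark can be checked numerically. The sketch below is an illustration under stated assumptions, not the book's code: it uses a quadratic whose gradient is $(6x_1 - 4x_2 + 4,\ -4x_1 + 4x_2)^T$ (consistent with the gradients tabulated in the example), and it uses the fact that for a quadratic with constant Hessian $Q$ the exact line search along a unit direction $d$ gives $\alpha = \|\nabla f\| / (d^T Q d)$. The names `grad`, `Q` and `steepest_descent` are hypothetical.

```python
import math

Q = ((6.0, -4.0), (-4.0, 4.0))        # constant Hessian of the assumed quadratic

def grad(x):
    """Gradient (6*x1 - 4*x2 + 4, -4*x1 + 4*x2) of the assumed quadratic."""
    return (6.0 * x[0] - 4.0 * x[1] + 4.0, -4.0 * x[0] + 4.0 * x[1])

def steepest_descent(x, tol=1e-8, max_iter=1000):
    """Steepest descent with exact line search; returns the point and the history."""
    history = []                       # entries (x, d, alpha)
    for _ in range(max_iter):
        g = grad(x)
        norm = math.hypot(g[0], g[1])
        if norm < tol:                 # stopping rule ||grad f|| < eps
            break
        d = (-g[0] / norm, -g[1] / norm)
        qd = (Q[0][0] * d[0] + Q[0][1] * d[1],
              Q[1][0] * d[0] + Q[1][1] * d[1])
        alpha = norm / (d[0] * qd[0] + d[1] * qd[1])   # exact minimiser of h
        history.append((x, d, alpha))
        x = (x[0] + alpha * d[0], x[1] + alpha * d[1])
    return x, history

x_min, history = steepest_descent((0.0, 0.0))
print(x_min, len(history))
```

Running this shows that many more than two iterations are needed before the gradient norm falls below the tolerance, and that consecutive directions in `history` are orthogonal to each other.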


Remark 9.4.2 Looking at the table for Example (9.4.1) we observe that the directions
$d^{(k)}$ are repeated alternately: $(-1,0)^T, (0,-1)^T, (-1,0)^T, (0,-1)^T$ etc. Is it a matter of
coincidence or is it always going to happen? Well, there is something deeper here, in
the sense that any two consecutive directions $d^{(k)}$ and $d^{(k+1)}$ given by the steepest descent
method are mutually orthogonal (see Theorem 9.4.1 below). Therefore in $R^2$ if the first
two directions are $d^{(1)}$ and $d^{(2)}$ then $d^{(3)}$ has to be $d^{(1)}$ (up to sign) and $d^{(4)}$ has to be $d^{(2)}$ (up to sign) and so on.
But this repetition may not be there in $R^3$ and higher dimensional spaces because when $d^{(1)}$ and
$d^{(2)}$ are orthogonal then we may have $d^{(3)}$ different from $d^{(1)}$, which is still orthogonal to
$d^{(2)}$. So the basic thing is the orthogonality of consecutive directions and NOT their
alternate repetition.
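Theorem 9.4.1 Let $d^{(k)}$ and $d^{(k+1)}$ be two consecutive directions generated by the steepest descent method. Then

    $\langle d^{(k)}, d^{(k+1)} \rangle = 0$,

where $\langle \cdot, \cdot \rangle$ denotes the standard inner product in $R^n$.

Proof. Let $\alpha_k > 0$ be such that $h(\alpha_k) = \underset{\alpha \geq 0}{\mathrm{Min}}\ h(\alpha)$. Then

    $\left.\dfrac{dh}{d\alpha}\right|_{\alpha = \alpha_k} = 0$
i.e. $\left.\dfrac{d}{d\alpha} f(x^{(k)} + \alpha d^{(k)})\right|_{\alpha = \alpha_k} = 0$
i.e. $(\nabla f(x^{(k)} + \alpha_k d^{(k)}))^T d^{(k)} = 0$
i.e. $(\nabla f(x^{(k+1)}))^T d^{(k)} = 0$
i.e. $-\|\nabla f(x^{(k+1)})\|\, (d^{(k+1)})^T d^{(k)} = 0$
i.e. $\langle d^{(k)}, d^{(k+1)} \rangle = 0$. □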


Remark 9.4.3 A geometrical interpretation of the above theorem could be that the vector
$-\nabla f(x^{(k)})$, which is normal to the surface $f(x)$ = constant at $x^{(k)}$, is tangent to the
surface at the point $x^{(k+1)}$. Therefore in $R^2$, the movement of the steepest descent method
will be as shown in Fig 9.8.





9.5 Newton's Method

Newton's method for finding real roots of the equation $g(y) = 0$, $y \in R$ is well known. The basic scheme here is

    $y_{k+1} = y_k - \dfrac{g(y_k)}{g'(y_k)}$

where $y_k$ is the current iterate or the current approximation.

It is natural here to be somewhat curious to know the connection between solving UMP's and the classical Newton's method for root finding. For this let us recollect that for solving the unconstrained minimization problem (9.3) we aim at finding a point $x \in R^n$ such that $\nabla f(x) = 0$. Therefore the basic problem of root finding appears here very naturally except that rather than finding the roots of a single equation we have to find the roots of a system, namely $\nabla f(x) = 0$. Looking at the iterative scheme of Newton's method for $g(y) = 0$, $y \in R$, we immediately get the iterative scheme

    $x^{(k+1)} = x^{(k)} - (H_f(x^{(k)}))^{-1}\, \nabla f(x^{(k)})$

for finding a solution of the system $\nabla f(x) = 0$.
A more acceptable mathematical argument for the above scheme could be as follows.
Let us approximate the given function $f$ (note that $f : R^n \to R$ is the given function in
problem (9.3) which is to be optimized) in a neighborhood of the current approximation
$x^{(k)}$ by the truncated Taylor series to get

    $f(x) \approx f(x^{(k)}) + (x - x^{(k)})^T \nabla f(x^{(k)}) + \dfrac{1}{2}(x - x^{(k)})^T H_f(x^{(k)}) (x - x^{(k)})$.

Therefore if we wish to minimize $f(x)$, it makes sense to minimize its quadratic approximation $q(x)$ where

    $q(x) = f(x^{(k)}) + (x - x^{(k)})^T \nabla f(x^{(k)}) + \dfrac{1}{2}(x - x^{(k)})^T H_f(x^{(k)}) (x - x^{(k)})$.

Let this minimization be done exactly and hence

    $\nabla q(x) = 0$,

i.e.

    $\nabla f(x^{(k)}) + \dfrac{1}{2} \cdot 2\, H_f(x^{(k)}) (x - x^{(k)}) = 0$

i.e.

    $x^{(k+1)} = x^{(k)} - (H_f(x^{(k)}))^{-1}\, \nabla f(x^{(k)})$.


Let us now look at the properties of this scheme. If $f$ is a positive definite quadratic form then Newton's method finds its minimum in one step (this is like solving a linear equation by Newton's method) because the system $\nabla f(x) = 0$ is then a linear system of equations. Therefore Newton's method for solving UMP has the quadratic termination property (in fact here for minimizing a positive definite quadratic form in $n$ variables it will take exactly one iteration). Also it will have order of convergence $p = 2$ and it will have the descent property, but it will not have the property of global convergence (because the standard Newton's method for root finding does not have this property). Therefore except for the case when we are minimizing a positive definite quadratic form (in the context of root finding it means solving a linear equation), we cannot start the method from an arbitrary point $x^{(0)} \in R^n$.

Therefore if we are minimizing a positive definite quadratic form by Newton's method, then not only can we start from any arbitrary point $x^{(0)} \in R^n$, we also know that we will get the optimal solution in exactly one iteration, i.e. $x^{(1)}$ has to be the optimal solution $\bar{x}$. However if the function $f$ in problem (9.3) is not a positive definite quadratic form then there are major problems with Newton's method. Apart from the fact that in this situation we cannot start from an arbitrary starting point $x^{(0)}$ ($x^{(0)}$ has to be 'close' to $\bar{x}$), there are serious issues with regard to the Hessian $H_f(x^{(k)})$. Why should $H_f(x)$ be invertible at the point $x^{(k)}$ for every $k$? It may be reasonable to assume that $H_f(x)$ is invertible in a neighborhood of $\bar{x}$ (because $f$ has to behave like a 'parabola' or a positive definite quadratic form around a point which is a strict local min point) but assuming its invertibility for every $x^{(k)}$ does not make sense. However Newton's method has order of convergence $p = 2$ and this is very attractive because there will be significant improvement in the value of the objective function even when we are close to the actual minimizing point. (Recall that the order of convergence of the steepest descent method is $p = 1$ which makes the algorithm very slow near the actual optimal point.)

In view of the above we should try to make certain modifications in Newton's method (to be called modified Newton's method) so that the resulting method has all the nice properties of Newton's method (namely the descent property, the quadratic termination property and order of convergence two) and is also globally convergent (so that we can start from an arbitrary point $x^{(0)} \in R^n$). It will be even better if the proposed method also takes care of the issues related with the invertibility of the Hessian. We discuss the modified Newton's method in the next section.
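Example 9.5.1 Use Newton's method to minimize $f(x_1, x_2) = 8x_1^2 - 4x_1x_2 + 5x_2^2$, $(x_1, x_2) \in R^2$.

Solution Here $f$ is a positive definite quadratic form in two variables, so we can take any point $x^{(0)} \in R^2$ and use Newton's method. We have

    $\nabla f(x) = \begin{pmatrix} 16x_1 - 4x_2 \\ -4x_1 + 10x_2 \end{pmatrix}$, $H_f(x) = \begin{pmatrix} 16 & -4 \\ -4 & 10 \end{pmatrix}$

and

    $(H_f(x))^{-1} = \dfrac{1}{144} \begin{pmatrix} 10 & 4 \\ 4 & 16 \end{pmatrix}$.

Taking $x^{(0)} = (5, 2)^T$ we get $\nabla f(x^{(0)}) = (72, 0)^T$. Then

    $x^{(1)} = x^{(0)} - (H_f(x^{(0)}))^{-1}\, \nabla f(x^{(0)}) = \begin{pmatrix} 5 \\ 2 \end{pmatrix} - \dfrac{1}{144} \begin{pmatrix} 10 & 4 \\ 4 & 16 \end{pmatrix} \begin{pmatrix} 72 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$

giving $\bar{x}_1 = 0$, $\bar{x}_2 = 0$ as the minimizing point.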
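The one-step behaviour of Newton's method on a positive definite quadratic form can be verified with exact rational arithmetic. The sketch below is illustrative only: the helper `newton_step_2d` and the starting point $(5, 2)^T$ are assumptions made for this illustration, and the $2 \times 2$ system is solved by Cramer's rule instead of forming the inverse.

```python
from fractions import Fraction

def newton_step_2d(x, grad, hess):
    """One Newton step x - H^{-1} grad f(x) in two variables (hypothetical helper)."""
    g1, g2 = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("Hessian is singular at this point")
    # Solve H s = g by Cramer's rule
    s1 = (d * g1 - b * g2) / det
    s2 = (a * g2 - c * g1) / det
    return (x[0] - s1, x[1] - s2)

# f(x1, x2) = 8*x1^2 - 4*x1*x2 + 5*x2^2
def grad_f(x):
    return (16 * x[0] - 4 * x[1], -4 * x[0] + 10 * x[1])

def hess_f(x):
    return ((Fraction(16), Fraction(-4)), (Fraction(-4), Fraction(10)))

x0 = (Fraction(5), Fraction(2))        # assumed starting point
x1 = newton_step_2d(x0, grad_f, hess_f)
print(x1)
```

Any other starting point gives the same result after one step, since for a quadratic form the quadratic approximation $q(x)$ coincides with $f$ itself.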


9.6 Modified Newton's Method


While discussing Newton's method in the last section we noted certain limitations,
as $H_f(x)$ may not be invertible at the points $x^{(k)}$. There is something more to it,
namely, even if we could guarantee the existence of $(H_f(x^{(k)}))^{-1}$, it is not necessary that
$-(H_f(x^{(k)}))^{-1} \nabla f(x^{(k)})$ will be a direction of descent unless we could also guarantee that
$H_f(x^{(k)})$ is positive definite at $x^{(k)}$. For this it is enough to check that $-(M_k)^{-1} \nabla f(x^{(k)})$
is always a direction of descent for any positive definite matrix $M_k$. Another difficulty
with Newton's method has been its lack of the global convergence property. Keeping these
things in mind, the following modification to Newton's method is suggested


    $x^{(k+1)} = x^{(k)} - \alpha_k M_k\, \nabla f(x^{(k)})$        (9.5)

where $M_k$ is an appropriate positive definite matrix (obtained from $H_f(x^{(k)})$ as explained here) and $\alpha_k \geq 0$ is the step size which is chosen as in the steepest descent method, i.e.

    $f(x^{(k)} - \alpha_k M_k \nabla f(x^{(k)})) = \underset{\alpha \geq 0}{\mathrm{Min}}\ f(x^{(k)} - \alpha M_k \nabla f(x^{(k)}))$.

How do we choose $M_k$? The natural choice $(H_f(x^{(k)}))^{-1}$ does not work because the Hessian may not be invertible at $x^{(k)}$, and if at some point $x^{(k)}$ it is invertible it is not necessary that its inverse is positive definite. But $H_f(x^{(k)})$ is real symmetric and hence all its eigenvalues are real. So we shall add a matrix of the form $\epsilon_k I$ ($\epsilon_k \geq 0$) to $H_f(x^{(k)})$ to get $F_k$, where $F_k = (\epsilon_k I + H_f(x^{(k)}))$. Here $\epsilon_k \geq 0$ is to be chosen so that all eigenvalues of $F_k$ become strictly positive and therefore $F_k$ and $M_k = F_k^{-1}$ become positive definite. For this, given the point $x^{(k)}$, we fix a constant $\delta > 0$ and calculate all eigenvalues of $H_f(x^{(k)})$. Let $\epsilon_k$ be the smallest non negative constant for which all eigenvalues of the matrix $\epsilon_k I + H_f(x^{(k)})$ are greater than or equal to $\delta$. Therefore, once $\epsilon_k$ has been chosen in this manner, we take $F_k = (\epsilon_k I + H_f(x^{(k)}))$ and $M_k = (\epsilon_k I + H_f(x^{(k)}))^{-1}$.

It can be shown that with the above modification the method described above (called
the modified Newton's method) has all the nice properties, namely it has the descent
property, it has the quadratic termination property, it has the property of global convergence and
its order of convergence $p$ is 2. However it is still not practical because to get $M_k$ we
need to compute all eigenvalues of $H_f(x^{(k)})$.
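The choice of $\epsilon_k$ can be illustrated in two variables, where the eigenvalues of a symmetric matrix are available in closed form. This is a minimal sketch under stated assumptions, not the book's code: the function names, the sample Hessian and gradient, and the value of $\delta$ are all hypothetical. Since every eigenvalue of $\epsilon I + H$ equals $\epsilon$ plus an eigenvalue of $H$, the smallest admissible shift is $\epsilon_k = \max(0,\ \delta - \lambda_{\min}(H))$.

```python
import math

def smallest_eigenvalue_2x2(h):
    """Smallest eigenvalue of the symmetric 2x2 matrix h = ((a, b), (b, d))."""
    (a, b), (_, d) = h
    return 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)

def eigen_shift(h, delta):
    """Smallest eps >= 0 such that all eigenvalues of eps*I + h are >= delta."""
    return max(0.0, delta - smallest_eigenvalue_2x2(h))

def modified_newton_direction(h, g, delta):
    """Return (-M g, eps) with M = (eps*I + h)^{-1} for a symmetric 2x2 h."""
    eps = eigen_shift(h, delta)
    a, b, d = h[0][0] + eps, h[0][1], h[1][1] + eps
    det = a * d - b * b
    direction = (-(d * g[0] - b * g[1]) / det,
                 -(a * g[1] - b * g[0]) / det)
    return direction, eps

# An indefinite Hessian for which the pure Newton direction is NOT a descent direction
h = ((1.0, 0.0), (0.0, -2.0))
g = (0.0, 1.0)
direction, eps = modified_newton_direction(h, g, delta=0.5)
slope = g[0] * direction[0] + g[1] * direction[1]   # directional derivative (unnormalized)
print(eps, direction, slope)
```

For this sample data the pure Newton direction $-H^{-1}g = (0, 0.5)^T$ has positive slope $g^T d = 0.5$, whereas the shifted direction has negative slope, which is the point of the modification. For an $n \times n$ Hessian the same rule applies, but all eigenvalues have to be computed numerically, which is the practical difficulty mentioned above.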

Therefore we now have two basic gradient based methods, namely the steepest descent
method and the modified Newton's method, for solving unconstrained minimization
problems. The steepest descent method is simple to implement, but it is not
very good from the convergence point of view, as its order of convergence $p$ is one.
The other method, namely the modified Newton's method, has all the nice properties,
including global convergence and order two convergence, but it is not of much use
because of the effort involved in evaluating $F_k$ and hence $M_k$. So the best option seems to
be to look for those algorithms for UMP's which are somewhere in the middle of the
spectrum, i.e. which are simpler to implement (unlike the modified Newton's method)
and have a better order of convergence (unlike the steepest descent method). There is
a whole class of such methods, namely conjugate direction methods and quasi Newton
methods. We shall discuss some of these in the next two sections.


9.7 The Conjugate Gradient Method 


Here we present certain basic principles of conjugate direction methods for solving UMP's and discuss the conjugate gradient method in detail. As mentioned in the previous section, these methods are better than the steepest descent method (in terms of order of convergence) and are simpler to implement than the modified Newton's method.

Definition 9.7.1 (Conjugate Directions). Let $Q$ be an $(n \times n)$ positive definite matrix. Then any two non-zero vectors $d^{(1)}$ and $d^{(2)}$ are called conjugate with respect to $Q$, if $(d^{(1)})^T Q\, d^{(2)} = 0$.
Sometimes we also call them $Q$-conjugate. The above definition is extended to more than two vectors: the vectors $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$ are called conjugate with respect to $Q$ if every two of them are conjugate, i.e. $(d^{(i)})^T Q\, d^{(j)} = 0$ ($i \neq j$). From now onwards we shall not write 'conjugate with respect to $Q$' but rather just write 'conjugate' in case there is no confusion.

Result 9.7.1 Let $\{d^{(0)}, d^{(1)}, \ldots, d^{(k)}\}$ be a set of non-zero vectors which are conjugate with respect to a given positive definite matrix $Q$. Then the vectors $d^{(0)}, d^{(1)}, \ldots, d^{(k)}$ are linearly independent.

Proof. To prove the above result, we have to show that $\alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \ldots + \alpha_k d^{(k)} = 0$ implies that each $\alpha_i = 0$. For this consider the equation

    $\alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \ldots + \alpha_k d^{(k)} = 0$

and pre-multiply both sides by $(d^{(i)})^T Q$. Then by the definition of conjugacy, we obtain

    $\alpha_i\, (d^{(i)})^T Q\, d^{(i)} = 0$,

which gives $\alpha_i = 0$ as the matrix $Q$ is positive definite. □


Now to understand the basic principles of conjugate direction methods, we first
consider the quadratic case, i.e. the problem

    $\underset{x \in R^n}{\mathrm{Min}}\ \dfrac{1}{2} x^T Q x - b^T x$        (9.6)

where $Q$ is an $(n \times n)$ symmetric positive definite matrix. As $Q$ is positive definite the
objective function of problem (9.6) is strictly convex and therefore the problem has a unique
minimizing point $\bar{x} \in R^n$.

Further the KKT conditions for problem (9.6) give

    $\nabla \left( \dfrac{1}{2} x^T Q x - b^T x \right) = 0$, i.e. $Qx = b$,

which implies $\bar{x} = Q^{-1} b$ (note that as $Q$ is positive definite, $Q^{-1}$ exists).

The above discussion shows that finding the unique minimizing point $\bar{x}$ of problem
(9.6) is equivalent to finding the unique solution of the system of equations $Qx = b$,
namely $\bar{x} = Q^{-1} b$. Hence there is no theoretical difficulty as such in computing $\bar{x}$. The main purpose of introducing conjugate
direction methods is to obtain this unique $\bar{x}$ without really finding $Q^{-1}$ explicitly. This is something
similar to what is done in linear optimization, where we do not compute the inverses explicitly
(note that the idea of pivoting used in the simplex algorithm is to avoid computing the inverse of the basis matrix explicitly).

We now demonstrate that if we can get hold of a set of $n$ non-zero conjugate vectors, then the
desired $\bar{x}$ can be obtained easily.
Result 9.7.2 Let $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$ be a set of $n$ non-zero vectors in $R^n$ which are conjugate with respect to $Q$. Then $\bar{x}$, which is the unique solution to the system $Qx = b$, or equivalently, the unique minimizing point of problem (9.6), is given by

    $\bar{x} = \sum_{k=0}^{n-1} \dfrac{(d^{(k)})^T b}{(d^{(k)})^T Q\, d^{(k)}}\ d^{(k)}$.

Proof. Using Result 9.7.1, we note that the vectors $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$ are linearly independent. As these vectors are exactly $n$ in number, they form a basis of $R^n$. Therefore there exist scalars $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ such that

    $\bar{x} = \alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \ldots + \alpha_{n-1} d^{(n-1)}$.        (9.7)

If we now pre-multiply the above equation by $(d^{(k)})^T Q$ and use the definition of conjugacy, we get

    $\alpha_k = \dfrac{(d^{(k)})^T Q\, \bar{x}}{(d^{(k)})^T Q\, d^{(k)}}$.

But $\bar{x}$ is the unique solution of $Qx = b$, i.e. $Q\bar{x} = b$, and hence

    $\alpha_k = \dfrac{(d^{(k)})^T b}{(d^{(k)})^T Q\, d^{(k)}}$

which on substitution in the equation (9.7) gives the result. □
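The expansion formula of Result 9.7.2 is easy to check numerically once a conjugate set is known. The following sketch uses illustrative data that are assumptions made here, namely $Q = \begin{pmatrix} 6 & -4 \\ -4 & 4 \end{pmatrix}$, $b = (-4, 0)^T$ and the directions $d^{(0)} = (-4, 0)^T$, $d^{(1)} = (-16/9, -8/3)^T$; the helper names are hypothetical. Exact rational arithmetic is used so that the checks are not affected by rounding.

```python
from fractions import Fraction as Fr

def mat_vec(q, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(qij * vj for qij, vj in zip(row, v)) for row in q]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def solve_by_conjugate_directions(q, b, directions):
    """x = sum_k (d_k^T b / d_k^T Q d_k) d_k for Q-conjugate directions d_k."""
    x = [Fr(0)] * len(b)
    for d in directions:
        coeff = dot(d, b) / dot(d, mat_vec(q, d))
        x = [xi + coeff * di for xi, di in zip(x, d)]
    return x

Q_mat = [[Fr(6), Fr(-4)], [Fr(-4), Fr(4)]]
b_vec = [Fr(-4), Fr(0)]
dirs = [[Fr(-4), Fr(0)], [Fr(-16, 9), Fr(-8, 3)]]

x_bar = solve_by_conjugate_directions(Q_mat, b_vec, dirs)
print(x_bar)
```

Note that no inverse of $Q$ is formed anywhere: each coefficient needs only one matrix-vector product and two inner products.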


The above result can also be visualized as the output of an iterative process (see 
Theorem 9.7.1), which becomes very handy in describing the conjugate gradient method. 


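Theorem 9.7.1 (Conjugate Direction Theorem). Let $\{d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}\}$ be a set of $n$ non-zero vectors in $R^n$ which are conjugate with respect to $Q$. For any $x^{(0)} \in R^n$ the sequence $\{x^{(k)}\}$ generated according to

    $x^{(k+1)} = x^{(k)} + \lambda_k d^{(k)}$,

with

    $\lambda_k = -\dfrac{(g^{(k)})^T d^{(k)}}{(d^{(k)})^T Q\, d^{(k)}}$ and $g^{(k)} = Q x^{(k)} - b$,

converges to the unique solution $\bar{x}$ of the system $Qx = b$ after $n$ steps, i.e. $x^{(n)} = \bar{x}$.

Proof. The vectors $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$ (as before) form a basis of $R^n$. Therefore there exist scalars $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ such that

    $\bar{x} - x^{(0)} = \alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \ldots + \alpha_{n-1} d^{(n-1)}$.        (9.8)

Pre-multiplying the above equation by $(d^{(k)})^T Q$ we get

    $\alpha_k = \dfrac{(d^{(k)})^T Q\, (\bar{x} - x^{(0)})}{(d^{(k)})^T Q\, d^{(k)}}$.        (9.9)

Now by the iterative scheme,

    $x^{(1)} - x^{(0)} = \lambda_0 d^{(0)}$
    $x^{(2)} - x^{(0)} = \lambda_0 d^{(0)} + \lambda_1 d^{(1)}$
    ...
    $x^{(k)} - x^{(0)} = \lambda_0 d^{(0)} + \lambda_1 d^{(1)} + \ldots + \lambda_{k-1} d^{(k-1)}$.        (9.10)

Again, the pre-multiplication of both sides of (9.10) by $(d^{(k)})^T Q$ gives

    $(d^{(k)})^T Q\, (x^{(k)} - x^{(0)}) = 0$.        (9.11)

Therefore

    $\alpha_k = \dfrac{(d^{(k)})^T Q\, (\bar{x} - x^{(k)} + x^{(k)} - x^{(0)})}{(d^{(k)})^T Q\, d^{(k)}}$        (9.12)

i.e.

    $\alpha_k = \dfrac{(d^{(k)})^T Q\, (\bar{x} - x^{(k)})}{(d^{(k)})^T Q\, d^{(k)}}$ (by (9.11))

    $= \dfrac{(d^{(k)})^T (Q\bar{x} - Q x^{(k)})}{(d^{(k)})^T Q\, d^{(k)}}$

    $= \dfrac{(d^{(k)})^T (b - Q x^{(k)})}{(d^{(k)})^T Q\, d^{(k)}}$

    $= -\dfrac{(d^{(k)})^T g^{(k)}}{(d^{(k)})^T Q\, d^{(k)}} = \lambda_k$.

Now by the iterative scheme

    $x^{(n)} - x^{(0)} = \lambda_0 d^{(0)} + \lambda_1 d^{(1)} + \ldots + \lambda_{n-1} d^{(n-1)}$.        (9.13)

Also as given by (9.8),

    $\bar{x} - x^{(0)} = \alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \ldots + \alpha_{n-1} d^{(n-1)}$.        (9.14)

But as shown above, $\lambda_k = \alpha_k$ for all $k = 0, 1, \ldots, n-1$, and hence from (9.13)-(9.14)

    $x^{(n)} - x^{(0)} = \bar{x} - x^{(0)}$

i.e. $x^{(n)} = \bar{x}$. □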


In view of Result 9.7.1 and Theorem 9.7.1 we conclude that to solve problem (9.6)
we need to obtain $n$ non-zero conjugate directions $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$. But how to determine
these directions is the main question now. The applicability of conjugate direction
methods will essentially depend on how simple or difficult the method of finding these
conjugate directions is. In the following we present one such conjugate direction method
where the conjugate directions are determined by using the gradient of the function $f$.
This method, therefore, is appropriately called the conjugate gradient method.


Conjugate Gradient Method for the Quadratic Case 


Let us again consider the unconstrained optimization problem (9.6), i.e.

    $\underset{x \in R^n}{\mathrm{Min}}\ \dfrac{1}{2} x^T Q x - b^T x$

where $Q$ is an $(n \times n)$ positive definite matrix. Let us also recall that solving
the above problem is equivalent to finding the unique solution $\bar{x}$ of the system $Qx = b$.
We first describe the conjugate gradient method and then later justify various steps
involved therein.

Step 1 Choose $x^{(0)} \in R^n$ arbitrary. Define $d^{(0)} = -g^{(0)} = b - Q x^{(0)}$. Set $k = 0$.


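Step 3 Now,

    $g^{(1)} = Q x^{(1)} - b = \begin{pmatrix} 6 & -4 \\ -4 & 4 \end{pmatrix} \begin{pmatrix} -2/3 \\ 0 \end{pmatrix} - \begin{pmatrix} -4 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 8/3 \end{pmatrix}$,

    $\beta_0 = \dfrac{(g^{(1)})^T Q\, d^{(0)}}{(d^{(0)})^T Q\, d^{(0)}} = 4/9$,

and

    $d^{(1)} = -g^{(1)} + \beta_0 d^{(0)} = \begin{pmatrix} 0 \\ -8/3 \end{pmatrix} + \dfrac{4}{9} \begin{pmatrix} -4 \\ 0 \end{pmatrix} = \begin{pmatrix} -16/9 \\ -8/3 \end{pmatrix}$.

Step 4 Now we obtain $x^{(2)}$ as $x^{(2)} = x^{(1)} + \lambda_1 d^{(1)}$. But

    $\lambda_1 = -\dfrac{(g^{(1)})^T d^{(1)}}{(d^{(1)})^T Q\, d^{(1)}} = \dfrac{3}{4}$.

Therefore,

    $x^{(2)} = \begin{pmatrix} -2/3 \\ 0 \end{pmatrix} + \dfrac{3}{4} \begin{pmatrix} -16/9 \\ -8/3 \end{pmatrix} = \begin{pmatrix} -2 \\ -2 \end{pmatrix}$

i.e. the minimizing point of $f(x_1, x_2)$ is $\bar{x}_1 = -2$, $\bar{x}_2 = -2$.

It can be verified that at this point $\nabla f(\bar{x}) = 0$, i.e. $Q\bar{x} - b = 0$, as it should be. Also $(d^{(0)})^T Q\, d^{(1)} = 0$, which has to be true because the basic idea of the conjugate gradient method is to generate directions $d^{(k)}$ such that $d^{(0)}, d^{(1)}, \ldots, d^{(n-1)}$ are conjugate with respect to $Q$.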
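The computations of this example can be reproduced with a short routine. The sketch below is an illustration under stated assumptions rather than the book's own code: it starts from $d^{(0)} = -g^{(0)}$ and uses the formulas $\lambda_k = -(g^{(k)})^T d^{(k)} / (d^{(k)})^T Q d^{(k)}$, $\beta_k = (g^{(k+1)})^T Q d^{(k)} / (d^{(k)})^T Q d^{(k)}$ and $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$ that appear in the worked steps above; the data $Q$, $b$, the starting point $x^{(0)} = (0,0)^T$ and all function names are assumptions made here.

```python
from fractions import Fraction as Fr

def mat_vec(q, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(qij * vj for qij, vj in zip(row, v)) for row in q]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(q, b, x):
    """Conjugate gradient method for Min (1/2) x^T Q x - b^T x (hypothetical helper).

    Returns the final point and the list of directions that were used.
    """
    g = [qx - bi for qx, bi in zip(mat_vec(q, x), b)]     # g = Qx - b
    d = [-gi for gi in g]                                  # d(0) = -g(0)
    directions = []
    for _ in range(len(b)):                                # at most n steps
        if dot(g, g) == 0:                                 # gradient already zero
            break
        qd = mat_vec(q, d)
        lam = -dot(g, d) / dot(d, qd)                      # step size lambda_k
        x = [xi + lam * di for xi, di in zip(x, d)]
        directions.append(d)
        g = [qx - bi for qx, bi in zip(mat_vec(q, x), b)]
        beta = dot(g, qd) / dot(d, qd)                     # beta_k
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return x, directions

Q_cg = [[Fr(6), Fr(-4)], [Fr(-4), Fr(4)]]
b_cg = [Fr(-4), Fr(0)]
x_cg, used = conjugate_gradient(Q_cg, b_cg, [Fr(0), Fr(0)])
print(x_cg, used)
```

With exact fractions the routine stops after two steps at $(-2, -2)^T$, in contrast with the steepest descent method, which needs many iterations on a quadratic of this kind.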
Result 9.7.3 For the conjugate gradient method, the following are true:

(i) $(g^{(k+1)})^T d^{(k)} = 0$

(ii) $\alpha_k = \dfrac{(g^{(k)})^T g^{(k)}}{(d^{(k)})^T Q d^{(k)}}$

(iii) $\beta_k = \dfrac{(g^{(k+1)})^T g^{(k+1)}}{(g^{(k)})^T g^{(k)}}$,

where $g^{(k)}$ is the gradient of the function at $x^{(k)}$, i.e. $g^{(k)} = Qx^{(k)} - b$.
Cc 
(a E i 


E 


= 


A P 
| a 
K 
. 


“S tO use induction E f of th 
~~ duction. For full proo 
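To prove (i) we note that $\bar\alpha_k > 0$ is chosen such that

$$h(\bar\alpha_k) = \min_{\alpha_k \ge 0} h(\alpha_k) = \min_{\alpha_k \ge 0} f(x^{(k)} + \alpha_k d^{(k)}),$$

and therefore $\dfrac{dh(\alpha_k)}{d\alpha_k} = 0$ at $\alpha_k = \bar\alpha_k$, i.e. $(\nabla f(x^{(k+1)}))^T d^{(k)} = 0$, i.e. $(g^{(k+1)})^T d^{(k)} = 0$.

To prove (ii), recall that $\alpha_k = -(g^{(k)})^T d^{(k)} / ((d^{(k)})^T Q d^{(k)})$ and $d^{(k)} = -g^{(k)} + \beta_{k-1} d^{(k-1)}$. Hence, using (i),

$$-(g^{(k)})^T d^{(k)} = (g^{(k)})^T g^{(k)} - \beta_{k-1} (g^{(k)})^T d^{(k-1)} = (g^{(k)})^T g^{(k)} - 0 = (g^{(k)})^T g^{(k)},$$

which proves the result.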
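Next we shall prove (iii). For this we first note that the gradient vectors $\{g^{(k)}\}$ are mutually orthogonal, i.e. $(g^{(k+1)})^T g^{(k)} = 0$. This is because

$$Qx^{(k+1)} - b = Q(x^{(k)} + \alpha_k d^{(k)}) - b,$$

i.e.

$$Qx^{(k+1)} - b = Qx^{(k)} - b + \alpha_k Q d^{(k)},$$

i.e.

$$g^{(k+1)} = g^{(k)} + \alpha_k Q d^{(k)},$$

i.e.

$$(g^{(k+1)})^T g^{(k)} = (g^{(k)})^T g^{(k)} + \alpha_k (d^{(k)})^T Q g^{(k)}.$$

But $d^{(k)} = -g^{(k)} + \beta_{k-1} d^{(k-1)}$, i.e. $g^{(k)} = -d^{(k)} + \beta_{k-1} d^{(k-1)}$, and therefore, using the conjugacy of $d^{(k)}$ and $d^{(k-1)}$ together with (ii),

$$(g^{(k+1)})^T g^{(k)} = (g^{(k)})^T g^{(k)} - \alpha_k (d^{(k)})^T Q d^{(k)} = 0.$$

Now, $g^{(k+1)} - g^{(k)} = Q(x^{(k+1)} - x^{(k)}) = \alpha_k Q d^{(k)}$ and therefore $Q d^{(k)} = \frac{1}{\alpha_k}(g^{(k+1)} - g^{(k)})$. Hence

$$\beta_k = \frac{(g^{(k+1)})^T Q d^{(k)}}{(d^{(k)})^T Q d^{(k)}} = \frac{(g^{(k+1)})^T (g^{(k+1)} - g^{(k)})}{\alpha_k (d^{(k)})^T Q d^{(k)}} = \frac{(g^{(k+1)})^T g^{(k+1)} - 0}{(g^{(k)})^T g^{(k)}} = \frac{(g^{(k+1)})^T g^{(k+1)}}{(g^{(k)})^T g^{(k)}},$$

as desired. □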


Fletcher and Reeves’ Method 


Let us consider the general unconstrained minimization problem

$$\min f(x) \qquad (9.15)$$

and attempt to translate the steps of the conjugate gradient method for problem (9.6) to problem (9.15). The first thing we note is that $g^{(k)}$ has to be taken as $\nabla f(x^{(k)})$ and, as $\alpha_k > 0$ is the step size, it has to be computed in the usual manner, i.e. by the exact minimization of $h(\alpha_k)$. The expression for $\alpha_k$ in the quadratic case also comes from the exact minimization of $h(\alpha_k)$, as in the quadratic case $h(\alpha_k)$ comes out to be a quadratic expression in $\alpha_k$. Also, rather than the expression for $\beta_k$ in terms of $Q$, we can use the expression given by Result 9.7.3, which involves $g^{(k)}$ and $g^{(k+1)}$ only. Therefore it makes sense to have the following steps for solving problem (9.15).

Step 1 Choose $x^{(0)} \in R^n$ arbitrary and take $d^{(0)} = -g^{(0)} = -\nabla f(x^{(0)})$. Set $k = 0$.

Step 2 Obtain $\bar\alpha_k > 0$ such that

$$h(\bar\alpha_k) = \min_{\alpha_k \ge 0} h(\alpha_k), \quad h(\alpha_k) = f(x^{(k)} + \alpha_k d^{(k)}).$$

Step 3 Set $x^{(k+1)} = x^{(k)} + \bar\alpha_k d^{(k)}$, $g^{(k+1)} = \nabla f(x^{(k+1)})$ and $d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}$, where

$$\beta_k = \frac{(g^{(k+1)})^T g^{(k+1)}}{(g^{(k)})^T g^{(k)}}.$$

Step 4 Continue till we get a point $x^{(k)}$ of the solution set, i.e. $\|\nabla f(x^{(k)})\| < \epsilon$ for some prescribed tolerance $\epsilon > 0$.
We can show that the above steps produce a method which has the descent property and also has the quadratic termination property, but is not globally convergent, i.e. we cannot start the method from an arbitrary point $x^{(0)} \in R^n$. To make this algorithm globally convergent we incorporate Powell's correction in the above procedure and what we get is called the Fletcher and Reeves' method for solving the unconstrained minimization problem (9.15). The so called Powell's correction essentially tells that starting with an arbitrary point $x^{(0)}$, we perform Steps 2-4 as described above till we get $x^{(n)}$. Then we go back to Step 1 above replacing $x^{(0)}$ by $x^{(n)}$ and continue. Thus after every $(n - 1)$ iterations, the direction is again taken as the steepest descent direction at the current point. Therefore the stepwise description of the Fletcher and Reeves' method is as follows.


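Step 1 Choose $x^{(0)} \in R^n$ arbitrary and take $d^{(0)} = -g^{(0)} = -\nabla f(x^{(0)})$.

Step 2 For $k = 0, 1, \ldots, (n - 1)$,

(a) Set $x^{(k+1)} = x^{(k)} + \bar\alpha_k d^{(k)}$, where $\bar\alpha_k > 0$ is chosen such that
$$h(\bar\alpha_k) = \min_{\alpha_k \ge 0} h(\alpha_k), \quad h(\alpha_k) = f(x^{(k)} + \alpha_k d^{(k)}).$$

(b) Compute $g^{(k+1)} = \nabla f(x^{(k+1)})$.

(c) Unless $k = (n - 1)$, set
$$d^{(k+1)} = -g^{(k+1)} + \beta_k d^{(k)}, \quad \beta_k = \frac{(g^{(k+1)})^T g^{(k+1)}}{(g^{(k)})^T g^{(k)}}.$$

Step 3 Replace $x^{(0)}$ by $x^{(n)}$ and go back to Step 1.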
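The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the one dimensional minimization is done by a golden section search on a bracket grown by doubling (any other line search could be substituted), and the test function is the quadratic $3x_1^2 - 4x_1x_2 + 2x_2^2 + 4x_1 + 6$ of Example 9.8.1, chosen only because its minimizer $(-2, -2)^T$ is known.

```python
import math


def golden_section(h, lo, hi, tol=1e-10):
    """Minimize a unimodal function h on [lo, hi] by the golden section rule."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - r * (b - a), a + r * (b - a)
    hc, hd = h(c), h(d)
    while b - a > tol:
        if hc < hd:                      # minimum lies in [a, d]
            b, d, hd = d, c, hc
            c = b - r * (b - a)
            hc = h(c)
        else:                            # minimum lies in [c, b]
            a, c, hc = c, d, hd
            d = a + r * (b - a)
            hd = h(d)
    return 0.5 * (a + b)


def line_search(f, x, d):
    """Approximate the minimizer of h(a) = f(x + a d) over a >= 0."""
    h = lambda a: f([xi + a * di for xi, di in zip(x, d)])
    hi = 1.0
    while h(2.0 * hi) < h(hi):           # grow the bracket until h stops decreasing
        hi *= 2.0
    return golden_section(h, 0.0, 2.0 * hi)


def fletcher_reeves(f, grad, x0, eps=1e-5, max_restarts=50):
    x, n = list(x0), len(x0)
    for _ in range(max_restarts):
        g = grad(x)
        d = [-gi for gi in g]            # Step 1: steepest descent direction
        for k in range(n):               # Step 2: k = 0, ..., n-1
            if math.sqrt(sum(gi * gi for gi in g)) < eps:
                return x
            alpha = line_search(f, x, d)                     # (a)
            x = [xi + alpha * di for xi, di in zip(x, d)]
            g_new = grad(x)                                  # (b)
            if k < n - 1:                                    # (c)
                beta = sum(a * a for a in g_new) / sum(a * a for a in g)
                d = [-gn + beta * di for gn, di in zip(g_new, d)]
            g = g_new
        # Step 3: restart from the current point x^(n)
    return x


f = lambda x: 3 * x[0] ** 2 - 4 * x[0] * x[1] + 2 * x[1] ** 2 + 4 * x[0] + 6
grad = lambda x: [6 * x[0] - 4 * x[1] + 4, -4 * x[0] + 4 * x[1]]
x_star = fletcher_reeves(f, grad, [0.0, 0.0])
```

The restart loop is Powell's correction: every $n$ line searches the direction is reset to the negative gradient at the current point.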


9.8 Davidon-Fletcher-Powell (DFP) Method

Let us again consider the quadratic case, i.e. problem (9.6). The idea of the DFP method is to approximate $Q^{-1}$ by a sequence of positive definite matrices, thereby avoiding the computation of $Q^{-1}$ explicitly. We shall describe the DFP method for the above problem and then specify it for the general unconstrained minimization problem (9.15).

Step 1 Choose $x^{(0)} \in R^n$ arbitrary. Also choose any positive definite matrix $S_0$ (e.g. $S_0 = I$). Set $k = 0$.

Step 2 Define $d^{(k)} = -S_k g^{(k)}$, where $g^{(k)} = \nabla f(x^{(k)})$. (For the quadratic case $\nabla f(x^{(k)}) = Qx^{(k)} - b$.)

Step 3 Obtain $\bar\alpha_k > 0$ such that $h(\bar\alpha_k) = \min_{\alpha_k \ge 0} h(\alpha_k)$, where $h(\alpha_k) = f(x^{(k)} + \alpha_k d^{(k)})$. Define

$$x^{(k+1)} = x^{(k)} + \bar\alpha_k d^{(k)}.$$

Step 4 Set $p^{(k)} = x^{(k+1)} - x^{(k)} = \bar\alpha_k d^{(k)}$ and $q^{(k)} = g^{(k+1)} - g^{(k)}$. Update $S_k$ to $S_{k+1}$ as

$$S_{k+1} = S_k + \frac{p^{(k)} (p^{(k)})^T}{(p^{(k)})^T q^{(k)}} - \frac{S_k q^{(k)} (q^{(k)})^T S_k}{(q^{(k)})^T S_k q^{(k)}}.$$

Step 5 Continue till we get $x^{(n)}$. In that case $x^{(n)} = \bar x$, the unique minimizing point of problem (9.6).


Remark 9.8.1 We can prove that $\{S_k\}$ is a sequence of positive definite matrices and $S_n = Q^{-1}$. Also $(p^{(i)})^T Q p^{(j)} = 0$ for $0 \le i < j \le k$, and therefore the $p^{(k)}$'s are conjugate directions with respect to $Q$. Since from the current point $x^{(k)}$ we are moving in the direction $p^{(k)} = \alpha_k d^{(k)}$, we observe that the DFP method can also be thought of as a conjugate direction method. Further, if we take $S_0 = I$, then it really becomes the conjugate gradient method. Here it should be noted that it is not necessarily true that starting from the same $x^{(0)} \in R^n$, with $S_0 = I$, the DFP method and the conjugate gradient method will generate the same conjugate directions.





Remark 9.8.2 It can be proved that the matrices $A_k$ and $B_k$ are real symmetric matrices of rank 1, where

$$A_k = \frac{p^{(k)} (p^{(k)})^T}{(p^{(k)})^T q^{(k)}}$$

and

$$B_k = \frac{S_k q^{(k)} (q^{(k)})^T S_k}{(q^{(k)})^T S_k q^{(k)}}.$$


T Ok T 2k — Dk ts called a rank one updation. 


KQ pÀ = pl) 9 <j < k. Therefor 
ee, unity eigen value for pect 
t as they are conjugate with re 






=~ & > da= 
> S N mi Ty f i ve 
p E basie 


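Remark 9.8.4 Since the DFP method is also a conjugate direction method, it will have the same properties, i.e. it has the descent property and the quadratic termination property but will not have the property of global convergence in general. It can also be made globally convergent by incorporating Powell's correction for problem (9.15). Thus, similar to Fletcher and Reeves' method for UMP's, after $n$ steps, once we get $x^{(n)}$, we restart the method replacing $x^{(0)}$ by $x^{(n)}$ and $S_0$ by $S_n$.

Example 9.8.1 Use the DFP method to minimize $f(x_1, x_2)$ over $(x_1, x_2) \in R^2$ where

$$f(x_1, x_2) = 3x_1^2 - 4x_1x_2 + 2x_2^2 + 4x_1 + 6,$$

starting with $x^{(0)} = (0, 0)^T$ and $S_0 = I$.

Solution The given function is a positive definite quadratic form and the given problem has the form of problem (9.6) with

$$Q = \begin{pmatrix} 6 & -4 \\ -4 & 4 \end{pmatrix}, \quad b = \begin{pmatrix} -4 \\ 0 \end{pmatrix}, \quad g(x) = Qx - b = \begin{pmatrix} 6x_1 - 4x_2 + 4 \\ -4x_1 + 4x_2 \end{pmatrix}.$$

Step 1 Take $x^{(0)} = (0, 0)^T$, $S_0 = I$ and therefore

$$d^{(0)} = -S_0 g^{(0)} = -\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 4 \\ 0 \end{pmatrix} = \begin{pmatrix} -4 \\ 0 \end{pmatrix}.$$

Step 2 Now we find $\bar\alpha_0$. For this we have $h(\alpha_0) = f(x^{(0)} + \alpha_0 d^{(0)}) = f(-4\alpha_0, 0) = 48\alpha_0^2 - 16\alpha_0 + 6$. Therefore $h(\bar\alpha_0) = \min_{\alpha_0 \ge 0} h(\alpha_0)$ gives $\bar\alpha_0 = 1/6$.

Step 3 We next obtain

$$x^{(1)} = x^{(0)} + \bar\alpha_0 d^{(0)} = \begin{pmatrix} -2/3 \\ 0 \end{pmatrix}, \quad p^{(0)} = x^{(1)} - x^{(0)} = \begin{pmatrix} -2/3 \\ 0 \end{pmatrix},$$

$$g^{(1)} = \begin{pmatrix} 0 \\ 8/3 \end{pmatrix}, \quad q^{(0)} = g^{(1)} - g^{(0)} = \begin{pmatrix} -4 \\ 8/3 \end{pmatrix},$$

$$A_0 = \begin{pmatrix} 1/6 & 0 \\ 0 & 0 \end{pmatrix}, \quad B_0 = \begin{pmatrix} 9/13 & -6/13 \\ -6/13 & 4/13 \end{pmatrix},$$

and hence

$$S_1 = S_0 + A_0 - B_0 = \begin{pmatrix} 37/78 & 6/13 \\ 6/13 & 9/13 \end{pmatrix}.$$

The next direction $d^{(1)}$ is given by $d^{(1)} = -S_1 g^{(1)} = (-16/13, -24/13)^T$. Further $\bar\alpha_1$ is obtained by minimizing $f(x^{(1)} + \alpha_1 d^{(1)})$ over $\alpha_1 \ge 0$. Therefore $\bar\alpha_1 = 13/12$ and

$$x^{(2)} = x^{(1)} + \bar\alpha_1 d^{(1)} = \begin{pmatrix} -2 \\ -2 \end{pmatrix}.$$

Here we can verify some of the results stated earlier. For example, we can check that the matrices $A_0$ and $-B_0$ are symmetric matrices of rank one and the matrix $S_1$ is positive definite. Further, if we also evaluate $p^{(1)}$ and $q^{(1)}$ to get $S_2$, then $S_2 = Q^{-1}$ and $(p^{(0)})^T Q p^{(1)} = 0$.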
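The DFP updates of Example 9.8.1 can be checked numerically with the short Python sketch below. The function name `dfp_quadratic` and its helpers are hypothetical, and the sketch is restricted to the quadratic case, where the exact step size is available in closed form as $\alpha_k = -(g^{(k)})^T d^{(k)} / ((d^{(k)})^T Q d^{(k)})$; it is meant as an illustration, not as a general purpose implementation.

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dfp_quadratic(Q, b, x0, S0):
    """DFP method for min 0.5*x^T Q x - b^T x; returns x and all matrices S_k."""
    n = len(b)
    x = list(x0)
    S = [row[:] for row in S0]
    g = [q - bi for q, bi in zip(mat_vec(Q, x), b)]      # g = Qx - b
    S_history = [S]
    for _ in range(n):
        if dot(g, g) < 1e-24:                            # already at the minimizer
            break
        d = [-s for s in mat_vec(S, g)]                  # Step 2: d = -S g
        alpha = -dot(g, d) / dot(d, mat_vec(Q, d))       # Step 3: exact step size
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [q - bi for q, bi in zip(mat_vec(Q, x_new), b)]
        p = [a - c for a, c in zip(x_new, x)]            # Step 4: p and q
        q = [a - c for a, c in zip(g_new, g)]
        Sq = mat_vec(S, q)                               # S is symmetric: S q q^T S = (Sq)(Sq)^T
        pq, qSq = dot(p, q), dot(q, Sq)
        S = [[S[i][j] + p[i] * p[j] / pq - Sq[i] * Sq[j] / qSq
              for j in range(n)] for i in range(n)]
        S_history.append(S)
        x, g = x_new, g_new
    return x, S_history


Q = [[6.0, -4.0], [-4.0, 4.0]]
b = [-4.0, 0.0]
x_min, S_hist = dfp_quadratic(Q, b, [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
S1, S2 = S_hist[1], S_hist[2]
```

Here `S1` reproduces the matrix $S_1$ of the example and `S2` agrees with $Q^{-1}$, whose entries are $1/2$, $1/2$, $1/2$ and $3/4$, as Remark 9.8.1 predicts.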


9.9 Preconditioning 


Let $Q$ be an $(n \times n)$ positive definite matrix. Then we have seen that minimizing $\frac{1}{2} x^T Q x - b^T x$ over $x \in R^n$ is equivalent to finding the unique solution of the system $Qx = b$. In this context, it is known that (e.g. Luenberger [106]) the convergence rate actually depends on the ratio $r = \lambda_{max} / \lambda_{min}$, where $\lambda_{max}$ and $\lambda_{min}$ are the largest and the smallest eigenvalues of $Q$. The number $r$ is called the condition number of the matrix $Q$. The convergence is best when $r$ is close to one and it becomes slow as $r$ increases. This is because for $r = 1$, i.e. $\lambda_{max} = \lambda_{min}$, the contours of the objective function become circular. Therefore, to accelerate the convergence of an algorithm we should try to modify the eigenvalue structure by transforming the matrix $Q$ suitably.

Let the given system be $Qx = b$, where $Q$ is an $(n \times n)$ positive definite matrix, $x \in R^n$ and $b \in R^n$. The key idea of preconditioning is to change the variable $x$ to $v$ via a nonsingular matrix $C$, $v = Cx$. The quadratic form $(\frac{1}{2} x^T Q x - b^T x)$ then gets changed to $\frac{1}{2} v^T ((C^{-1})^T Q C^{-1}) v - ((C^{-1})^T b)^T v$. If we now use an algorithm (e.g. the conjugate gradient method) to minimize this transformed quadratic form, or equivalently solve the system $((C^{-1})^T Q C^{-1}) v = (C^{-1})^T b$, then the convergence rate will depend on the eigenvalue structure of $(C^{-1})^T Q C^{-1}$ rather than that of $Q$. Therefore in preconditioning our aim is to choose the matrix $C$ such that the eigenvalues of $(C^{-1})^T Q C^{-1}$ are almost equal.

In practice, it is not necessary to carry out the transformation $v = Cx$ explicitly, because we can apply the algorithm to the problem

$$\min_v \; \tfrac{1}{2} v^T ((C^{-1})^T Q C^{-1}) v - ((C^{-1})^T b)^T v$$

to get $\bar v$ and then invert the transformation to get $\bar x = C^{-1} \bar v$. In the case of the conjugate gradient method, we do not need $C$ explicitly but rather use the matrix $M = C^T C$, which is symmetric and positive definite by construction. Such a method is called the preconditioned conjugate gradient method. We may refer to Nocedal and Wright [119] for further details in this regard.
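The effect of the change of variable on the condition number can be seen on a small example. The Python sketch below is only an illustration under assumptions that are not part of the text: the matrix $Q$ is a made-up $2 \times 2$ example, and $C$ is taken as the diagonal matrix with entries $\sqrt{Q_{ii}}$ (a simple diagonal scaling), which is just one possible choice of $C$.

```python
import math


def condition_number_2x2(M):
    """r = lambda_max / lambda_min for a symmetric positive definite 2x2 matrix."""
    mean = 0.5 * (M[0][0] + M[1][1])
    radius = math.sqrt((0.5 * (M[0][0] - M[1][1])) ** 2 + M[0][1] ** 2)
    return (mean + radius) / (mean - radius)


# Hypothetical badly scaled positive definite matrix.
Q = [[100.0, 1.0], [1.0, 1.0]]

# Assumed preconditioner: C = diag(sqrt(Q_11), sqrt(Q_22)).
c = [math.sqrt(Q[0][0]), math.sqrt(Q[1][1])]

# For diagonal C, the (i, j) entry of (C^{-1})^T Q C^{-1} is Q_ij / (c_i c_j).
Q_tilde = [[Q[i][j] / (c[i] * c[j]) for j in range(2)] for i in range(2)]

r_before = condition_number_2x2(Q)
r_after = condition_number_2x2(Q_tilde)
```

For this matrix the condition number drops from about 101 to $11/9 \approx 1.22$, so the contours of the transformed quadratic are nearly circular.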
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The l n’s . 
’ and 9.5 respectively. These methods form the os are discussed in Sections 9.4 


: of othe i 
in particular, the conjugate gradient method and the wah ag based methods, 
Sections 97 and 9.8 respectively. method, discussed in 


g 








• The steepest descent method uses the first order approximation of the function and it does not perform well when we are close to the optimum. Newton's method uses the second order approximation of the function and it performs well even when we are close to the optimum. However, the convergence of Newton's method is guaranteed only when the starting iterate is chosen close to the optimal solution.

• M. R. Hestenes and E. Stiefel in 1952 originally gave the idea of the conjugate direction method which led to the development of the conjugate gradient method for the quadratic case. Later in 1964, R. Fletcher and C. Reeves extended the conjugate gradient method to nonlinear functions.

• The idea of the DFP method is originally due to W. C. Davidon in 1959, which was simplified and reformulated by R. Fletcher and M. Powell in 1963. This method is also referred to as the variable metric method and falls under the class of quasi-Newton methods.

• The updation formula of the DFP method has led to several new updations of the current positive definite matrix, in particular the BFGS class given by C. Broyden, R. Fletcher, D. Goldfarb and D. Shanno.

• Although we have discussed only the step length based methods, there is another class of methods, developed in the 70's, called the trust region methods. Here the step length $\alpha_k$ is always taken as unity, so that the new iterate is given by $x^{(k+1)} = x^{(k)} + d^{(k)}$. In order to ensure that the descent property holds, we may have to take several trial vectors before finding a satisfactory $d^{(k)}$. This involves solving a constrained subproblem of the form 'Min $(g^{(k)})^T d + \frac{1}{2} d^T H d$, subject to $\|d\|_2 \le \Delta$ for some $\Delta$'. We may refer to Dixon [49], Gill et al. [68], Bonnans et al. [22], and Nocedal and Wright [119] for further details.
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at f is a unimodal min function. 





1. Sketch f(x) and hence justify th 





I -r : x 9 i ; 
9, Perform 4 iterations of the golden section rule to minimize f(x) over [0,4) a 
9.2 Consider the problem of minimizing p(x) over [-5,5] where | 
x . ee 
ox) = max(2—x,x=1,1- 5), reR. 9." 
1. Solve the above problem graphically. are 
2. Is @ a unimodal min function ? Give reasons. i 2x}: 
3. Perform 3 iterations of the golden section rule and hence give an approximate valye g.i 
of Xmin- 
9.3 Complete 3 more iterations of the following table where the function being mini- 
mized is |x|. i 
Ki: Zik Xuk| Xpk|\ Xak Esk Enk DR 
1| -5 | 2.64 |-2.08|-0.29 he 
pos 
(1-2) q(t 
9.4 Let O(x) = max| x, ,xeR. e 
2 dÊ 
1. Verify that ọ is a unimodal min function. 
2. Perform 3 iterations of the golden section rule to minimize ġ(x) over [-1,1]. Identify dÊ 
the interval I4 after 3 iterations which will contain the minimizing point. 
3. Perform 3 iterations of the Fibonacci search method to minimize (x) over ELin 4 iL 
and hence identify the interval I4. | B, 


= 


4. compare the ratio z 
1 





as obtained at (2) and (3) above. 


9.5 The Fibonacci search method is to be used to find within 10% the value of x in the 
interval [0,1] that maximizes the function f(x) = Min(x,2 - x*),x ER. 

| 1. How many iterations are required? .. eas 
2. Give results for the first 3 iterations only. e 






} 


s Consider the problem of minimization of f (x1, X2) = 4x? kex — 8x1 X2 over (x12) © 


ee 
ei 









escent method taking the starting point wit" 


ye reasons for your answer T 
(-1, 2) Í 
your 


ws Ew VE h 
rn ino: rt ” ae +h P =A f aS i pens e; . (0) a 
raent method starting with x = 

í E E SA 
e o S EOS ae r x yf 
i RS ma y NA MAA. y is E ns fo 
ang point? Give reaso 
vv d -a 1 
n 


E 
Pee bt 
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wen problem by the DEP methoc f 
golve the given prooiem oy the | F] method starting with x = (—1,2)". Do you take 
she same number of iterations as in the conjugate gradient method? Give reasons for i 
your answer, iy 


et ae 05) a 
9,7 Let d™) = (1,0)" and Q = l 6 | Find d® € R? (A® #0) such that d® and d® 


are conjugate directions with respect to Q. Hence find the minimum value of (x? A 3x2 F! 
gla = N = 12. 


Valte 
9,8 Consider the system 
Xi = X = —4 
Mini. 
ii 3X = Xo = 0. 
Taking d0) = (0), use the conjugate direction method to find its solution. 
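9.9 Let $v^{(0)}$, $v^{(1)}$ and $v^{(2)}$ be linearly independent vectors in $R^3$ and $Q$ be a $(3 \times 3)$ positive definite matrix. Let

$$d^{(0)} = v^{(0)},$$

$$d^{(1)} = v^{(1)} - \frac{(d^{(0)})^T Q v^{(1)}}{(d^{(0)})^T Q d^{(0)}} d^{(0)},$$

$$d^{(2)} = v^{(2)} - \frac{(d^{(0)})^T Q v^{(2)}}{(d^{(0)})^T Q d^{(0)}} d^{(0)} - \frac{(d^{(1)})^T Q v^{(2)}}{(d^{(1)})^T Q d^{(1)}} d^{(1)}.$$

1. Show that $d^{(0)}, d^{(1)}, d^{(2)}$ constitute a set of $Q$-conjugate directions in $R^3$.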
2. Verify your answer for (1) above for vo = (1,0, Ora Oe, oo) =e) 
S00 
andQ=|0 3 O 
3 OL 
t Hence or otherwise use conjugate directions method to solve the system Q x = b, 


where b = (0,1, =)". l 
3. Generalize the above result for R”. 


9.10 Use the conjugate gradient method to find a solution of the system 










oe aS = 8 
2 tN = 10. 
k e Ji 
You may take the starting point as x) = (0,0)". 


9.11 Solve 9.10 above by the DFP method. 


ae ry Re E 
LOTR IIL fen 
W J SAPS NA 


te C lirection method in the closed form to find the solution of 
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Axi = Kg * Yai 


s d = (1,0)! 


direction a 
You may taken one he given system and solve the same by Newton’. Metho 


Write the equivalent UMP for t 


9.13 Show that the eigenvectors corresponding to distinct eigenvalues of q real syy, 
metric matrix Q are conjugate with respect toQ. 

t and d® = (1,0)!. Find d) + 0 such that d® and d0 i 
3 


2 
9.14 Let Q = i 


conjugate with respect to Q. Is d™ unique? 
Hence or otherwise use the conjugate direction method to solve the system 


xı + 2X = 4 
ON ste oan 


9.15 Are the following statements true? Give reasons for your answer. 


1. Let (x) = Min(3x — 10, —5x +5) forO<x < 5. Then Q is a unimodal maz function, 

2. Let h = 1 and I, = 0.01. Then for the Fibonacci search method Xy1 = 0.382, and 
Xg1 = 0.618. 

3. Let f(x1,x2) = (xX — x* — 2x5) be maximized over R2 by Newton’s method. Then, 


irrespective of the starting point, the method will always give Xmax in exactly one 
iteration. 


4. Let l = 2 and I = 
method is 10. 


ð. Let a function of three variables be mi 


0.01. Then the value of n to be chosen for the Fibonacci search 


nna by the steepest descent method. Then 


ws 1 1 
dy, = (0, 4-4, and dk44 = (40, -4) can not be two consecutive directions of 
descent. 


6. Let f : [-5,20] > R be a unimodal min function with f(4) = 10 and f(6) = 20. Then 
the point Xmin lies in the interval |-5,4 HE 

7. a the Junction f(%1,%2) = (x2 + 2x9) be minimized by the steepest descent methot 
| wie ie direction of descent at the point (0,1) is (0,-1) 

ò. Lhe directions dq) = (1 9\T = ce j 

ae te, ame d2 = (3,4)" and d3 = (-1,1)" cannot be conjugate with 


F eS GIN p EP 
if j G N f 1 fry fA iy Í @} ; a 
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ol 
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men there cannot be more than n conjug 

3 + Dax, starting with (xf = ei 
“Minimizing point exactly after two ' 


Lee 
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9.36 Fill in the blanks. 


į. For minimizing the function (x¢ 

of descent at the point (0,1) is.......... 
9, For minimizing the function 6x? +x2 

where the “hata Ane a OE E 


9. If f(x) = X +ax* +bx has a local mazimum at x = 


+ 2x2) by the steepest descent method, the direction 


2 +4x1x2, the DFP method will give a point (X1, X2) 


Se (a —1 and a local minimum at x = 1, 
4. If J, = 10 and I, = 0.001, then for using the Fibonacci search method, the value of n 
MOWA an 


5. The directional derivative of JOA, X>) = 1 + X4Xp + Ms at (1,0) in the direction of 


Vlin ® eens 


6. For a function f : R” — R, let (Vf(x))'d > 0 for all directions d € R”, then 


VA 

7. The problem of minimizing f(x1,x2) = 6xı + Oia + i + 2x;x2 over R? is equiv- 
alent to solving a system, of linear equations Qx = b, where Q =............ 
Er NN 


8. Let the eigenvalue of a 3X3 real symmetric matric A be 0.4, 1.2, and 0.4. Let 
B= A? -2A + I. Then the condition number of Bis ..........-. | 
9. Let {ry}, rk = a, O < a < 1, be the given sequence. Then {rk} converges to zero with 


the order of convergence as .....------: 


10. The function f(x) = min(1—|x|, y1 — (x - 1)*),0<x < 1, isaunimodal............ functic 
but not a unimodal .......+++-- function. 


ti- 


i a 


Dad 
B Ned Í 
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algorithms in Nonlinear Programming 


10.1 Introduction 


In Chapter 7, we have studied Wolfe’s meth 
cramming problem, namely the quadratic pr 
we wish to proceed further and aim to deve 
programming problems. For this, we first consider the class of linearly constrained NLP’s 
and then study the class of those NLP’s whi 
There are several algorithms in the literature for solving NLP’s, each having its own 
merits and demerits. Since an exhaustive discussion of these algorithms in a text book 
format is neither possible nor even desira 
the choice of various algorithms to be dis 
For solving linearly constrained NLP's we have selected only two methods for discussion in this chapter. These are commonly known as Frank and Wolfe's method and Rosen's gradient projection method. The main reason for choosing Frank and Wolfe's method is that it is a simplex based method and has guaranteed convergence. Here the problem is solved by solving a sequence of linear programming problems which are constructed by obtaining the linear approximation of the objective function $f(x)$ at the current feasible point. On the other hand, Rosen's gradient projection method has a natural appeal because it is essentially a modification of the gradient based method which we have already studied for solving the unconstrained optimization problems.

A very popular method for solving general NLP's is the sequential unconstrained minimization technique (SUMT) due to Fiacco and McCormick [57]. As the name suggests, here we solve the given nonlinear programming problem by solving a sequence of unconstrained minimization problems which we have already studied. Traditionally, there are two approaches for applying SUMT, one is called the penalty function method, and the other is called the barrier function method, and we plan to discuss both of these in the present chapter.

Certain optimization problems, e.g. the separable nonlinear programming problems, can be visualized as multistage decision problems. These problems can be solved efficiently by using the technique of dynamic programming due to R.E. Bellman. We present
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A be an optimal solution of (10.2 
x (10.2). Then wr Nye (k 
considered, 1O ) S w(x . Thus there are two 


In case (i) we shall be proving that $x^{(k)}$ satisfies the KKT conditions for problem (10.1), whereas in case (ii) we shall show the existence of a direction $d^{(k)}$ so that if we move in the direction $d^{(k)}$ from the current point $x^{(k)}$ then the objective function improves. Here it must be noted that, because of the convexity of $f$, case (i) will result in giving $x^{(k)}$ as an optimal solution of the given problem (10.1).

Let us consider the case (ii) first. We note that $w_k(\bar x^{(k)}) < w_k(x^{(k)})$ does not necessarily mean that $f(\bar x^{(k)}) < f(x^{(k)})$, because the linear approximation of $f(x)$ holds only in a neighborhood of $x^{(k)}$, and $\bar x^{(k)}$ may not lie in this neighborhood. However, if we define

$$x^{(k+1)} = (1 - \alpha) x^{(k)} + \alpha \bar x^{(k)} = x^{(k)} + \alpha(\bar x^{(k)} - x^{(k)}), \quad 0 < \alpha \le 1,$$

then $x^{(k+1)}$ is feasible to problem (10.1) because $x^{(k)}$ and $\bar x^{(k)}$ are feasible and $x^{(k+1)}$ is a convex combination of $x^{(k)}$ and $\bar x^{(k)}$. Also, if we write $d^{(k)} = \bar x^{(k)} - x^{(k)}$, then $w_k(\bar x^{(k)}) < w_k(x^{(k)})$ means

$$(d^{(k)})^T \nabla f(x^{(k)}) < 0.$$

From the above, we infer that the directional derivative of $f$ at $x^{(k)}$ in the direction $d^{(k)}$ is less than zero. Therefore $d^{(k)}$ is a good direction to move for seeking an improved value of $f$. Thus from $x^{(k)}$ we should move in the direction of $(\bar x^{(k)} - x^{(k)})$. To find how much to move, i.e. the value of the step size $\bar\alpha_k$, we compute the minimum of $h(\alpha_k)$ over $(0, 1]$, i.e.

$$h(\bar\alpha_k) = \min_{0 < \alpha_k \le 1} h(\alpha_k),$$

where $h(\alpha_k) = f(x^{(k)} + \alpha_k d^{(k)})$, $d^{(k)} = \bar x^{(k)} - x^{(k)}$. Thus the new point $x^{(k+1)}$ is given by

$$x^{(k+1)} = x^{(k)} + \bar\alpha_k d^{(k)} = x^{(k)} + \bar\alpha_k(\bar x^{(k)} - x^{(k)}).$$











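In the computation of $\bar\alpha_k$, we should note the main difference with respect to what we did for the unconstrained optimization problems. Here the minimization of $h(\alpha_k)$ is carried out over $[0, 1]$, but for the unconstrained optimization problems it was carried out over all $\alpha_k \ge 0$ because the problem was unconstrained. Since (10.1) is a constrained problem, we need an upper bound on $\alpha_k$ so that the new point $x^{(k+1)}$ does not leave the feasible region. In the above we do not find the largest such $\alpha_k$ but rather use the bound $\alpha_k \le 1$, which guarantees that $x^{(k+1)}$ remains in the feasible region.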

point at this 


I fier 2 
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l i thod 
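Stepwise Description of Frank and Wolfe's Method

The stepwise description of Frank and Wolfe's method is now given below.

Step 1 Choose an initial feasible point $x^{(0)}$. Set $k = 0$.

Step 2 Evaluate $\nabla f(x^{(k)})$ and construct the following linear programming problem

$$\min \; w_k(x) = x^T \nabla f(x^{(k)}) \quad \text{subject to} \quad Ax = b, \; x \ge 0. \qquad (10.3)$$

Let an optimal solution of (10.3) be $\bar x^{(k)}$.

Step 3 Evaluate $w_k(\bar x^{(k)})$ and $w_k(x^{(k)})$. If $w_k(\bar x^{(k)}) = w_k(x^{(k)})$, then stop as we get $x^{(k)}$ as an optimal solution of the given problem (10.1). Otherwise, i.e. if $w_k(\bar x^{(k)}) < w_k(x^{(k)})$, then go to Step 4.

Step 4 Obtain $\bar\alpha_k \in (0, 1]$ such that

$$h(\bar\alpha_k) = \min_{0 < \alpha_k \le 1} h(\alpha_k) = \min_{0 < \alpha_k \le 1} f(x^{(k)} + \alpha_k(\bar x^{(k)} - x^{(k)})).$$

Define

$$x^{(k+1)} = x^{(k)} + \bar\alpha_k(\bar x^{(k)} - x^{(k)}),$$

replace $k$ by $k + 1$ and go to Step 2.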


We now illustrate Frank and Wolfe’s method with the help of Example 10.2.1 given 
below. 


Example 10.2.1 Use Frank and Wolfe's method to find an optimal solution of

$$\min \; 2x_1^2 + 2x_1x_2 + 2x_2^2 - 4x_1 - 6x_2$$

subject to

$$x_1 + 2x_2 \le 2, \quad x_1, x_2 \ge 0. \qquad (10.4)$$

Solution We note that the objective function is a convex function (in fact it is a strictly convex function) and the constraints are linear, so the given problem can be solved by Frank and Wolfe's method. To initiate the method we need a feasible point $x^{(0)}$. In general this is obtained by Phase-I of the simplex algorithm, but here we can plot the feasible region and choose a feasible point directly. We take $x^{(0)} = (1/2, 1/2)^T$. Also

$$\nabla f(x) = \begin{pmatrix} 4x_1 + 2x_2 - 4 \\ 2x_1 + 4x_2 - 6 \end{pmatrix}.$$

First Iteration

Here $\nabla f(x^{(0)}) = (-1, -3)^T$ and hence $w_0(x) = -x_1 - 3x_2$. Therefore we need to solve the following LPP

$$\min \; w_0(x) = -x_1 - 3x_2 \quad \text{subject to} \quad x_1 + 2x_2 \le 2, \quad x_1, x_2 \ge 0. \qquad (10.5)$$

Solving the above LPP graphically, we get $\bar x^{(0)} = (0, 1)^T$, $w_0(\bar x^{(0)}) = -3$. Also $w_0(x^{(0)}) = -2$. Here, it may be remarked that we have solved the above LPP graphically because it involves only two variables. However, in general, we need to employ the simplex method to solve the various LPP's occurring in Frank and Wolfe's method.

As $w_0(\bar x^{(0)}) < w_0(x^{(0)})$, $x^{(0)}$ is not a KKT point. Hence from $x^{(0)}$ we move in the direction $(\bar x^{(0)} - x^{(0)})$ to get the new point $x^{(1)}$. For this we first obtain the function

$$h(\alpha_0) = f(x^{(0)} + \alpha_0(\bar x^{(0)} - x^{(0)})) = f\left(\frac{1 - \alpha_0}{2}, \frac{1 + \alpha_0}{2}\right) = \frac{1}{2}\alpha_0^2 - \alpha_0 - \frac{7}{2}$$

and then choose $\bar\alpha_0 \in (0, 1]$ such that $h(\bar\alpha_0) = \min h(\alpha_0)$ for $0 < \alpha_0 \le 1$. Here again, in general we may have to use a suitable one dimensional optimization algorithm, e.g. the Fibonacci search method or the golden section rule. But for our example we may simply use ordinary calculus to get $\bar\alpha_0 = 1$. Therefore

$$x^{(1)} = \left(\frac{1 - \bar\alpha_0}{2}, \frac{1 + \bar\alpha_0}{2}\right)^T = (0, 1)^T.$$


Second Iteration

For the point $x^{(1)}$, $\nabla f(x^{(1)}) = (-2, -2)^T$ and hence we have to construct the LPP

$$\min \; w_1(x) = -2x_1 - 2x_2 \quad \text{subject to} \quad x_1 + 2x_2 \le 2, \quad x_1, x_2 \ge 0.$$

Solving the above LPP we obtain $\bar x^{(1)} = (2, 0)^T$ with $w_1(\bar x^{(1)}) = -4$ and $w_1(x^{(1)}) = -2$. As $w_1(\bar x^{(1)}) < w_1(x^{(1)})$, $x^{(1)}$ also is not a KKT point. Therefore from $x^{(1)}$ we move in the direction $(\bar x^{(1)} - x^{(1)})$. For this we find $\bar\alpha_1$ by minimizing

$$h(\alpha_1) = f(x^{(1)} + \alpha_1(\bar x^{(1)} - x^{(1)})) = f(2\alpha_1, 1 - \alpha_1)$$

over $0 < \alpha_1 \le 1$, which gives $\bar\alpha_1 = 1/6$. Hence

$$x^{(2)} = x^{(1)} + \bar\alpha_1(\bar x^{(1)} - x^{(1)}) = (1/3, 5/6)^T.$$

Third Iteration

For the point $x^{(2)}$ we obtain $\nabla f(x^{(2)}) = (-1, -2)^T$ and hence we construct the LPP

$$\min \; w_2(x) = -x_1 - 2x_2 \quad \text{subject to} \quad x_1 + 2x_2 \le 2, \quad x_1, x_2 \ge 0.$$

An optimal solution of the above LPP can be taken as $\bar x^{(2)} = (0, 1)^T$ (note that for this LPP any point on the line segment joining the points $(2, 0)^T$ and $(0, 1)^T$ is optimal). Now $w_2(\bar x^{(2)}) = -2$ and $w_2(x^{(2)}) = -2$. As $w_2(\bar x^{(2)}) = w_2(x^{(2)})$, $x^{(2)}$ is a KKT point which, because of the convexity of $f$, becomes optimal for the given problem.

Therefore $x_1 = 1/3$ and $x_2 = 5/6$ is an optimal solution to the given problem (10.4).
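The three iterations of Example 10.2.1 can be reproduced with the following Python sketch. It makes two simplifying assumptions that are not part of the method as stated: the LPP of Step 2 is solved by enumerating the three vertices $(0,0)$, $(2,0)$, $(0,1)$ of the feasible region instead of by the simplex method, and the step size is computed in closed form because the objective is quadratic. The starting point $(1/2, 1/2)^T$ is the one implied by $\nabla f(x^{(0)}) = (-1, -3)^T$. All function and variable names are hypothetical.

```python
# Vertices of the feasible region x1 + 2*x2 <= 2, x1, x2 >= 0.
VERTICES = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]


def grad(x):
    """Gradient of f(x) = 2*x1^2 + 2*x1*x2 + 2*x2^2 - 4*x1 - 6*x2."""
    return (4 * x[0] + 2 * x[1] - 4, 2 * x[0] + 4 * x[1] - 6)


def frank_wolfe(x0, tol=1e-9, max_iter=100):
    x = tuple(x0)
    iterates = [x]
    for _ in range(max_iter):
        g = grad(x)
        w = lambda y: y[0] * g[0] + y[1] * g[1]    # w_k(y) = y^T grad f(x^(k))
        x_bar = min(VERTICES, key=w)               # Step 2: LPP solved over the vertices
        if w(x) - w(x_bar) <= tol:                 # Step 3: w_k(x_bar) = w_k(x), stop
            break
        d = (x_bar[0] - x[0], x_bar[1] - x[1])
        # Step 4: h(a) = f(x + a d) is quadratic in a, with Hessian [[4, 2], [2, 4]].
        curvature = 4 * d[0] ** 2 + 4 * d[0] * d[1] + 4 * d[1] ** 2
        alpha = min(1.0, -(g[0] * d[0] + g[1] * d[1]) / curvature)
        x = (x[0] + alpha * d[0], x[1] + alpha * d[1])
        iterates.append(x)
    return x, iterates


x_opt, path = frank_wolfe((0.5, 0.5))
```

The list `path` holds the points $(1/2, 1/2)$, $(0, 1)$ and $(1/3, 5/6)$, after which the stopping test of Step 3 is met.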


10.3 Mathematical Justification of Frank and Wolfe’s Method 


Frank and Wolfe's method discussed in Section 10.2 can be justified mathematically if we prove two things. Firstly we should prove that whenever $w_k(\bar x^{(k)}) = w_k(x^{(k)})$, $x^{(k)}$ is a KKT point, and secondly we should prove that the method always converges. We shall prove the first result fully here but the second result we shall only state and refer to an appropriate text, e.g. Zangwill [173].

Theorem 10.3.1 Consider the linear programming problem (10.3). Let $\bar x^{(k)}$ be an optimal solution of (10.3) and $w_k(\bar x^{(k)}) = w_k(x^{(k)})$. Then $x^{(k)}$ is a KKT point of the given nonlinear programming problem (10.1).


2rahlom (171) 9\ :. e 

‘ JILG l F 4 ao Mtr $ N = ’ 
ae J) 415 PIVOT) Z 
yt VOLIL DY 


ee : 
ae ipliel 
linear, there exist KKT multi 
4 yore = mes : 
itions holc e i 
GSI 7: 
$$\min f(x)$$

subject to

$$a_{(i)} x \le b_i, \; i \in I_1,$$
$$a_{(i)} x = b_i, \; i \in I_2, \qquad (10.6)$$

where $I = \{1, 2, \ldots, m\}$, $I_1 \cup I_2 = I$, and $a_{(i)}$ is the $i$-th row of the coefficient matrix $A$.

If problem (10.6) is feasible then we find a feasible solution $x^{(0)}$ by using Phase-I of the simplex method, and initiate the descent process of our algorithm from $x^{(0)}$. Let $x^{(k)}$ be the current feasible point.

We next identify those constraints in (10.6) which are active at $x^{(k)}$, i.e. they hold as equations at $x^{(k)}$. Let

$$I(x^{(k)}) = \{i \in I : a_{(i)} x^{(k)} = b_i\}$$

be the index set of active constraints at $x^{(k)}$. Clearly $I_2 \subseteq I(x^{(k)})$.
Now at the given feasible point $x^{(k)}$, we wish to find a feasible direction $d^{(k)}$ which is usable. For this, we need the direction $d^{(k)}$ to satisfy $(d^{(k)})^T \nabla f(x^{(k)}) < 0$ so that a movement in the direction $d^{(k)}$ decreases the value of the objective function. Initially we consider directions satisfying $a_{(i)} d^{(k)} = 0$, $i \in I(x^{(k)})$, so that all active constraints remain active at the new point $x^{(k+1)}$ as well, where $x^{(k+1)} = x^{(k)} + \alpha_k d^{(k)}$, $\alpha_k > 0$ being the step size. This requirement amounts to requiring that $d^{(k)}$ lie on the tangent space $M$, which is defined by the active constraints at $x^{(k)}$. In other words, we need the projection of the chosen direction on the subspace $M$ of the active constraints at $x^{(k)}$, to get the new point $x^{(k+1)}$. In Rosen's gradient projection method, we take $d^{(k)}$ as the projection of the negative gradient at $x^{(k)}$, i.e. $-\nabla f(x^{(k)})$, on the subspace $M$.

To find $d^{(k)}$ and the corresponding projection matrix, let there be $q$ active constraints at $x^{(k)}$ and let $A_q$ be the $(q \times n)$ matrix which is composed of the rows of $A$ corresponding to the active constraints (we are assuming that there are no redundant rows in $A$). Let $M$ and $N$ respectively be the null space of $A_q$ and the range space of $A_q^T$. Then $M = \{d : A_q d = 0\}$, $N = \{A_q^T \beta, \beta \in R^q\}$ and $R^n = M \oplus N$. Since $-g^{(k)} \in R^n$, where $g^{(k)} = \nabla f(x^{(k)})$, we have

$$-g^{(k)} = d^{(k)} + A_q^T \beta_k. \qquad (10.7)$$

But $A_q d^{(k)} = 0$ and hence premultiplying (10.7) by $A_q$ gives

$$\beta_k = -(A_q A_q^T)^{-1} A_q g^{(k)} \qquad (10.8)$$

and

$$d^{(k)} = -[I - A_q^T (A_q A_q^T)^{-1} A_q] g^{(k)}, \qquad (10.9)$$

i.e. $d^{(k)} = P_k(-g^{(k)})$, where $g^{(k)} = \nabla f(x^{(k)})$ and $P_k$ is the projection matrix given by

$$P_k = [I - A_q^T (A_q A_q^T)^{-1} A_q].$$

Now $(g^{(k)})^T d^{(k)} = (g^{(k)} + d^{(k)} - d^{(k)})^T d^{(k)} = (g^{(k)} + d^{(k)})^T d^{(k)} - (d^{(k)})^T d^{(k)} = -\|d^{(k)}\|^2 < 0$, i.e. $d^{(k)}$ is a usable direction, provided $d^{(k)} \ne 0$. Here we have used the fact that $g^{(k)} + d^{(k)}$ is perpendicular to $d^{(k)}$ because $R^n = M \oplus N$.

Next we consider the possibility that the projected negative gradient is zero, i.e. $d^{(k)} = 0$. In that case (10.7) gives

$$g^{(k)} + A_q^T \beta_k = 0. \qquad (10.11)$$


At this stage let us recall that $A_q$ is composed of rows corresponding to the active constraints at $x^{(k)}$. Therefore, if the components of $\beta_k$ for the active constraints are non-negative, then (10.11) implies that the KKT conditions are satisfied at $x^{(k)}$ and therefore the process terminates. However, if at least one of the components of $\beta_k$ (say $\beta_{jk}$) is less than zero, then it is possible to move in a new direction to get an improved point. This new direction is obtained by relaxing the inequality corresponding to that $j$ for which $\beta_{jk} < 0$, i.e. the inequality $a_{(j)} x \le b_j$, and getting the new matrix $\bar A_q$ by deleting the row $a_{(j)}$ from $A_q$. The new direction is then obtained by projecting the negative gradient $-g^{(k)}$ onto the subspace determined by the rows of $\bar A_q$. Let $\bar d^{(k)}$ be this projection. Then we can show that

$$(g^{(k)})^T \bar d^{(k)} < 0 \quad \text{and} \quad a_{(j)} \bar d^{(k)} < 0.$$

Here we have used the relations $-g^{(k)} = A_q^T \beta_k = \bar d^{(k)} + \bar A_q^T \bar\beta_k$, $\bar d^{(k)} \ne 0$, $(g^{(k)})^T \bar d^{(k)} < 0$, $\bar A_q \bar d^{(k)} = 0$ and $\beta_{jk} < 0$.

Thus $\bar d^{(k)}$ is a direction of descent and it is also feasible, because $a_{(i)} \bar d^{(k)} = 0$, $i \in I(x^{(k)})$, $(i \ne j)$ and $a_{(j)} \bar d^{(k)} < 0$.

We next find $\bar\alpha_k$ by finding the length of the feasible segment of the line originating at $x^{(k)}$ and then minimizing $f$ over that segment. This seems to be the most natural way of finding $\bar\alpha_k$, but there could certainly be more efficient ways of determining the same.










Stepwise Description of the Algorithm 


Step 1 Start with a feasible point $x^{(0)}$ (set $k = 0$).

Step 2 Identify the active constraints at $x^{(k)}$, form the matrix $A_q$ and hence obtain the projection matrix $P_k$, given by

$$P_k = I - A_q^T (A_q A_q^T)^{-1} A_q.$$

Compute $g^{(k)} = \nabla f(x^{(k)})$ and take $d^{(k)} = -P_k g^{(k)}$.

Step 3 If $d^{(k)} \ne 0$, go to Step 4; otherwise go to Step 5.

Step 4 Find $\alpha_{max}$, the largest $\alpha$ for which $x^{(k)} + \alpha d^{(k)}$ is feasible, and obtain $\bar\alpha_k$ such that

$$f(x^{(k)} + \bar\alpha_k d^{(k)}) = \min \{ f(x^{(k)} + \alpha d^{(k)}) : 0 \le \alpha \le \alpha_{max} \}.$$

Set $x^{(k+1)} = x^{(k)} + \bar\alpha_k d^{(k)}$ and return to Step 2.

Step 5 If $d^{(k)} = 0$, find $\beta_k = -(A_q A_q^T)^{-1} A_q g^{(k)}$.

(a) If $\beta_{jk} \ge 0$ for all $j$ corresponding to active constraints, stop as $x^{(k)}$ is a KKT point.

(b) If some $\beta_{jk} < 0$, then delete the $j$-th row from $A_q$ to construct $\bar A_q$ and return to Step 2.

In practice we choose the most negative component of $\beta_k$ to delete the row from $A_q$ to construct the matrix $\bar A_q$. Further, in case $A_q$ consists of only one row and that too has to be deleted, we take $d^{(k)} = -g^{(k)}$ and proceed as usual.

We now illustrate the working of the present algorithm.


Example 10.4.1 Solve the following problem by Rosen's gradient projection method

    Min $(x_1 - 3)^2 + (x_2 - 7)^2$
    subject to
    $x_1 - 2x_2 \le 0$
    $x_1 + 2x_2 \le 12$
A + 6X%> < 24 
    $x_1 \ge 0,\ x_2 \ge 1.$

Solution Here $f(x_1, x_2) = (x_1 - 3)^2 + (x_2 - 7)^2$ and $\nabla f(x) = (-6 + 2x_1, -14 + 2x_2)^T$. Also,
before applying the algorithm, we need to write the given NLP in the form (10.6), and
therefore the last two constraints are written as $-x_1 \le 0$ and $-x_2 \le -1$.


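First Iteration
Let $x^{(0)} = (1, 1)^T$. This gives $-g^{(0)} = -\nabla f(x^{(0)}) = (4, 12)^T$. At $x^{(0)}$ only the constraint
$-x_2 \le -1$ is active, and hence $A_q = (0, -1)$. Therefore $(A_q A_q^T)^{-1} = 1$ and
$P_0 = I - A_q^T A_q = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.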


eee ee e oae to find a and aO : 
E EE ET E da, respectively. We recall that 


q i J). 


Ie faacihlial | 
IO E wre? 5. i O) e iao ] i 
x) is a global min point o 
ion of the given N LP an 


f the given problem. Hence (x) = 3 


1 
d the optimal value is 49/4. 2* 9/2 


convex function, 
is an optimal solut 


10.5 The Penalty Function Method: Motivation 


To motivate the penalty function method, let us consider the following simple problem
in one variable

    Min $x^2$
    subject to
    $1 \le x \le 4.$    (10.12)

Here the function to be minimized is $f(x) = x^2$ and the feasible region is the interval
[1,4], and hence obviously the minimizing point is $\bar{x} = 1$. Now if we define a function $P(x)$
as

    $P(x) = 0$, if $x$ is feasible (i.e. $1 \le x \le 4$)
    $P(x) = +\infty$, otherwise (i.e. $x > 4$ or $x < 1$)


and construct a new problem 


    Min $(x^2 + P(x))$,    (10.13)

then problems (10.12) and (10.13) are equivalent. This is because in (10.13), as $P(x) = +\infty$
for $x$ infeasible, the minimization has to take place in the feasible region, i.e. [1,4], only.
Also problem (10.13) is an unconstrained minimization problem.

The above discussion suggests that given a nonlinear programming problem

    Min $f(x)$
    subject to
    $g_i(x) \le 0 \quad (i = 1, 2, \ldots, m)$,    (10.14)

we can always associate an equivalent unconstrained minimization problem (UMP)
as

    Min $(f(x) + P(x))$,    (10.15)










the given NL 
lem UMP. But 





S heavily the decision maker for not being in 
y if he is in the feasible region. So it is natural 


zero penalty if the point x is feasible and assign positive penalty for the point x if it is 
not feasible, in such a manner that the penalty becomes more and more the farther we 
are away from the feasible region. One typical example for problem (10.12) could be 


    $P(x) = 0$ if $1 \le x \le 4$, $\quad P(x) = (1 - x)^2$ if $x < 1$, $\quad P(x) = (x - 4)^2$ if $x > 4$,

or equivalently

    $P(x) = (\mathrm{Max}(1 - x, 0))^2 + (\mathrm{Max}(x - 4, 0))^2$

whose graph is shown in Fig 10.3.
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4 rw i ms 

As $\alpha \to +\infty$, (UMP)$_\alpha$ $\to$ (UMP), because $\alpha P(x)$ tends to the penalty of (10.13), which is zero
on the feasible region and $+\infty$ outside it. The important point to note here is
that for each $\alpha$, (UMP)$_\alpha$ is a smooth unconstrained minimization problem and therefore
can be solved by the standard unconstrained minimization techniques discussed earlier
in Chapter 9. If $x(\alpha)$ is an optimal solution of (UMP)$_\alpha$, then we expect that as $\alpha \to +\infty$,
$x(\alpha)$ should converge to $\bar{x} = 1$, the minimizing point of problem (10.12). In fact that
is going to happen, as the figure given below (Fig 10.4) suggests.


Fig. 10.4. Graph of $f(x) + \alpha P(x)$.


Once we have understood the construction of (UMP)$_\alpha$ for problem (10.12), there
seems to be no difficulty in translating everything for the general NLP (10.14). Therefore
taking motivation from problems (10.12) and (10.16), we choose the following smooth
penalty function for the nonlinear programming problem (10.14)

    $P(x) = \sum_{i=1}^{m} (\mathrm{Max}(g_i(x), 0))^2$    (10.17)

and construct the sequence of unconstrained optimization problems (UMP)$_\alpha$ as






a - o 
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aA p er two cases T 
Case (i) $\mathrm{Max}(1 - x, 0) = 0$. This gives $1 - x \le 0$, i.e. $x \ge 1$. Then (10.20) gives $x = 0$, which
is not more than or equal to one. So this case is not possible.

Case (ii) $\mathrm{Max}(1 - x, 0) = (1 - x)$. This gives $(1 - x) \ge 0$, i.e. $x \le 1$. Also for this case
$x = \alpha(1 - x)$, i.e.

    $x = \frac{\alpha}{1 + \alpha} < 1$ as $\alpha > 0$.

Therefore there is no contradiction with the earlier condition $x \le 1$. Thus

    $x(\alpha) = \frac{\alpha}{1 + \alpha}$,

which tends to 1 as $\alpha \to +\infty$. Hence the minimizing point of the given problem is
$\bar{x} = 1$.


Example 10.6.2 Use the penalty function method to solve

    Min $x^2$
    subject to
    $1 \le x \le 4.$

Solution Again, geometrically we obtain that the minimizing point is $\bar{x} = 1$. Now the
constraints are $(1 - x) \le 0$ and $(x - 4) \le 0$, which give

    $P(x) = (\mathrm{Max}(1 - x, 0))^2 + (\mathrm{Max}(x - 4, 0))^2$;

therefore for $\alpha > 0$, the unconstrained minimization problem (UMP)$_\alpha$ is

    $\mathrm{Min}_{x \in R}\; q(x, \alpha) = x^2 + \alpha P(x)$.

Now equating the derivative of $q(x, \alpha)$ w.r.t. $x$ to zero, we get

    $x - \alpha\,\mathrm{Max}(1 - x, 0) + \alpha\,\mathrm{Max}(x - 4, 0) = 0$,    (10.21)


which becomes the basic equation of optimality for the given problem. Again we consider
various cases and discard all those which are not possible, obtaining $x(\alpha)$ from
those cases which are possible (here only one case will be possible: Why?).

Case (i) $\mathrm{Max}(x - 4, 0) = (x - 4)$. This implies $(x - 4) \ge 0$, i.e. $x \ge 4$, and hence
$\mathrm{Max}(1 - x, 0) = 0$. Therefore, for this case, equation (10.21) gives $x + \alpha(x - 4) = 0$, i.e.
$x = 4\alpha/(1 + \alpha) < 4$, which is not consistent with the implication $x \ge 4$. Therefore, this
case is not possible.

Case (ii) $\mathrm{Max}(1 - x, 0) = 0$ and $\mathrm{Max}(x - 4, 0) = 0$. These conditions imply $x \ge 1$ and
$x \le 4$. Also for these conditions, equation (10.21) gives $x = 0$, which is not consistent
with the implications $x \ge 1$ and $x \le 4$. Therefore, this case is also not possible.

Case (iii) $\mathrm{Max}(1 - x, 0) = (1 - x)$. This implies $(1 - x) \ge 0$, i.e. $x \le 1$. But then
$(x - 4) \le (1 - 4) = -3 < 0$, and hence $\mathrm{Max}(x - 4, 0) = 0$. Therefore equation (10.21)
gives

    $x - \alpha(1 - x) = 0$,

i.e.

    $x(\alpha) = \frac{\alpha}{1 + \alpha} < 1 \quad (\alpha > 0)$,

which is consistent with the implication $x \le 1$. Therefore this is the only possible case
and we see that the minimizing point is

    $\bar{x} = \lim_{\alpha \to +\infty} \frac{\alpha}{1 + \alpha} = 1$.
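The closed form $x(\alpha) = \alpha/(1 + \alpha)$ can be cross-checked numerically. The following Python sketch (an illustration only, with hypothetical function names) minimizes $q(x, \alpha) = x^2 + \alpha P(x)$ by a plain ternary search, which is valid here because $q$ is convex in $x$; the search interval $[-10, 10]$ is an assumption chosen to contain the minimizer.

```python
def penalty(x):
    """P(x) = (max(1 - x, 0))^2 + (max(x - 4, 0))^2 for 1 <= x <= 4."""
    return max(1.0 - x, 0.0) ** 2 + max(x - 4.0, 0.0) ** 2


def q(x, alpha):
    """Penalized objective q(x, alpha) = x^2 + alpha * P(x)."""
    return x * x + alpha * penalty(x)


def minimize_convex_1d(func, lo, hi, iterations=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) <= func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def x_alpha(alpha):
    """Numerical minimizer of q(., alpha) on the assumed interval [-10, 10]."""
    return minimize_convex_1d(lambda x: q(x, alpha), -10.0, 10.0)


for alpha in (1.0, 10.0, 1000.0):
    print(alpha, x_alpha(alpha), alpha / (1.0 + alpha))
```

The numerical minimizers approach the boundary point 1 from the infeasible side, as the analysis predicts.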


Remark 10.6.1 Looking at Examples 10.6.1 and 10.6.2 above, we should not feel
that the penalty function method is applicable only when the minimizing point is on the
boundary of the feasible region. Example 10.6.3 given below illustrates that the
penalty function method will be able to locate the minimizing point, even if it is an
interior point.

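Example 10.6.3 Use the penalty function method to solve

    Min $x^2$
    subject to
    $-1 \le x \le 1.$

Solution The constraints are $(x - 1) \le 0$ and $-(1 + x) \le 0$. Therefore

    $P(x) = (\mathrm{Max}(x - 1, 0))^2 + (\mathrm{Max}(-(1 + x), 0))^2$,

and problem (UMP)$_\alpha$ is

    $\mathrm{Min}_{x \in R}\; q(x, \alpha) = x^2 + \alpha P(x)$.

Equating the derivative of $q(x, \alpha)$ w.r.t. $x$ to zero, we get

    $x + \alpha\,\mathrm{Max}(x - 1, 0) - \alpha\,\mathrm{Max}(-(1 + x), 0) = 0$.    (10.22)

One possibility is $\mathrm{Max}(x - 1, 0) = 0$ and $\mathrm{Max}(-(1 + x), 0) = 0$. The first equality gives
$x \le 1$ while the second equality gives $x \ge -1$. For this case (10.22) gives
$x + \alpha \cdot 0 - \alpha \cdot 0 = 0$, i.e. $x(\alpha) = 0$, which is consistent with $-1 \le x \le 1$.

We can verify that none of the other cases is possible. For example, let
$\mathrm{Max}(x - 1, 0) = (x - 1)$. Then $x \ge 1$, i.e. $-1 - x \le -2 < 0$, and hence
$\mathrm{Max}(-1 - x, 0) = 0$. Therefore (10.22) gives $x + \alpha(x - 1) = 0$, i.e. $x = \alpha/(1 + \alpha) < 1$,
which contradicts $x \ge 1$. Hence $x(\alpha) = 0$ for every $\alpha > 0$, and the minimizing point
$\bar{x} = 0$ is an interior point of the feasible region.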

Example 10.6.4 Use the penalty function method to solve

    Max $20x_1 + 16x_2 - 2x_1^2 - x_2^2 - (x_1 + x_2)^2$
    subject to
    $x_1 + x_2 \le 5$
    $x_1 \ge 0,\ x_2 \ge 0.$

Solution To solve the given problem by the penalty function method we have to express
the given problem as

    Min $-20x_1 - 16x_2 + 3x_1^2 + 2x_2^2 + 2x_1x_2$
    subject to
    $x_1 + x_2 - 5 \le 0$
    $-x_1 \le 0$
    $-x_2 \le 0.$

We observe here that the objective function to be minimized is a convex function and
the constraints are linear. Now the penalty function $P(x_1, x_2)$ is given by

    $P(x_1, x_2) = (\mathrm{Max}(x_1 + x_2 - 5, 0))^2 + (\mathrm{Max}(-x_1, 0))^2 + (\mathrm{Max}(-x_2, 0))^2$,

and for $\alpha > 0$ problem (UMP)$_\alpha$ is

    $\mathrm{Min}_{(x_1, x_2) \in R^2}\; q(x_1, x_2, \alpha) = f(x_1, x_2) + \alpha P(x_1, x_2)$,

where $f(x_1, x_2) = -20x_1 - 16x_2 + 3x_1^2 + 2x_2^2 + 2x_1x_2$.

For $\alpha > 0$, $q(x_1, x_2, \alpha)$ is a convex function of $(x_1, x_2)$ and therefore to solve (UMP)$_\alpha$ we
have to evaluate the partial derivatives of $q(x_1, x_2, \alpha)$ w.r.t. $x_1$ and $x_2$ and equate them
to zero. This gives

    $-20 + 6x_1 + 2x_2 + 2\alpha\,\mathrm{Max}(x_1 + x_2 - 5, 0) - 2\alpha\,\mathrm{Max}(-x_1, 0) = 0$    (10.23)

    $-16 + 2x_1 + 4x_2 + 2\alpha\,\mathrm{Max}(x_1 + x_2 - 5, 0) - 2\alpha\,\mathrm{Max}(-x_2, 0) = 0$.    (10.24)

Because of the Max terms, we have to consider all possibilities in equations (10.23) and (10.24) to identify
the cases which are possible and then solve the resulting equations. This gives

    $x_1(\alpha) = \frac{7\alpha + 12}{3\alpha + 5}, \qquad x_2(\alpha) = \frac{8\alpha + 14}{3\alpha + 5}$.

Therefore the optimal solution of the given nonlinear programming problem is $(\bar{x}_1, \bar{x}_2)$,
where

    $\bar{x}_1 = \lim_{\alpha \to +\infty} x_1(\alpha) = \frac{7}{3}$

and

    $\bar{x}_2 = \lim_{\alpha \to +\infty} x_2(\alpha) = \frac{8}{3}$.

Remark 10.6.2 The above example illustrates that for a general NLP there will be
$n$ equations given by $\nabla_x q(x, \alpha) = 0$ to be solved and there will be $m$ Max terms in
these equations. Obviously in this scenario, it is almost impossible to obtain
$x(\alpha)$ explicitly as a function of $\alpha$ and then take the limit $\alpha \to +\infty$ to
get the solution of the given (NLP). Therefore it is imperative
that we do not insist on obtaining $x(\alpha)$ explicitly as a function of $\alpha$, and think of
some suitable numerical implementation. It is natural to think that any numerical
implementation will require numerical values of $\alpha$, but then how to interpret the statement
$\alpha \to +\infty$? This aspect of the penalty function method we discuss in the next section.


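    Min $f(x)$
    subject to
    $g_i(x) \le 0 \quad (i = 1, 2, \ldots, m)$,    (10.25)

where $f$ and $g_i$ $(i = 1, 2, \ldots, m)$ are convex functions.

Step 1. Choose a suitable penalty function $P(x)$. In practice we take the penalty function

    $P(x) = \sum_{i=1}^{m} (\mathrm{Max}(g_i(x), 0))^2$.

Step 2. Choose an increasing sequence of positive real numbers which tends to infinity, i.e.
a sequence $\{\alpha_k\}$ such that for all $k$, $\alpha_k > 0$, $\alpha_{k+1} > \alpha_k$ and $\{\alpha_k\} \to +\infty$.

Step 3. Construct the following unconstrained minimization problem (UMP)$_{\alpha_1}$

    $\mathrm{Min}_{x \in R^n}\; q(x, \alpha_1) = f(x) + \alpha_1 P(x)$,

and solve it by a suitable unconstrained minimization technique starting with the point
$x^{(0)}$. Let $x^{(1)}$ be an optimal solution of (UMP)$_{\alpha_1}$. Set $k = 1$.

Step 4. Construct the following unconstrained minimization problem (UMP)$_{\alpha_{k+1}}$

    $\mathrm{Min}_{x \in R^n}\; q(x, \alpha_{k+1}) = f(x) + \alpha_{k+1} P(x)$,

and solve it by a suitable unconstrained minimization technique starting with the point $x^{(k)}$,
where $x^{(k)}$ is the optimal solution of (UMP)$_{\alpha_k}$ as obtained at the preceding step. Thus
(UMP)$_{\alpha_2}$ is solved starting with the point $x^{(1)}$ which is obtained by solving (UMP)$_{\alpha_1}$.

Step 5. Continue till $\alpha_k P(x^{(k)})$ is close to zero or equivalently $f(x^{(k)})$ is close to $q(x^{(k)}, \alpha_k)$,
i.e. $\alpha_k P(x^{(k)}) < \epsilon$ or $(q(x^{(k)}, \alpha_k) - f(x^{(k)})) < \epsilon$ for a tolerance $\epsilon > 0$.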
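The five steps above can be sketched on the problem of Example 10.6.4. The sketch below is in Python and is only one possible realization: the inner solver is a Newton iteration on the piecewise quadratic $q$ (an assumption, any unconstrained method could be substituted), the sequence $\alpha_k = 1, 10, 100, \ldots$ and the tolerance $\epsilon = 10^{-4}$ are illustrative choices, and all function names are hypothetical. Each subproblem is warm-started from the previous solution, as in Step 4.

```python
def grad_hess(x1, x2, alpha):
    """Gradient and generalized Hessian of q = f + alpha * P (Example 10.6.4)."""
    s = max(x1 + x2 - 5.0, 0.0)    # violation of x1 + x2 <= 5
    n1 = max(-x1, 0.0)             # violation of x1 >= 0
    n2 = max(-x2, 0.0)             # violation of x2 >= 0
    g1 = -20.0 + 6.0 * x1 + 2.0 * x2 + 2.0 * alpha * s - 2.0 * alpha * n1
    g2 = -16.0 + 2.0 * x1 + 4.0 * x2 + 2.0 * alpha * s - 2.0 * alpha * n2
    t = 2.0 * alpha if s > 0.0 else 0.0
    h11 = 6.0 + t + (2.0 * alpha if n1 > 0.0 else 0.0)
    h12 = 2.0 + t
    h22 = 4.0 + t + (2.0 * alpha if n2 > 0.0 else 0.0)
    return (g1, g2), (h11, h12, h22)


def solve_ump(x1, x2, alpha, tol=1e-8, max_iter=50):
    """Newton iteration for (UMP)_alpha, started from (x1, x2)."""
    for _ in range(max_iter):
        (g1, g2), (h11, h12, h22) = grad_hess(x1, x2, alpha)
        if abs(g1) + abs(g2) < tol:
            break
        det = h11 * h22 - h12 * h12
        x1, x2 = (x1 - (h22 * g1 - h12 * g2) / det,
                  x2 - (h11 * g2 - h12 * g1) / det)
    return x1, x2


def penalty_method(x0, alphas, eps=1e-4):
    """Steps 3 to 5: solve (UMP)_alpha for increasing alpha with warm starts."""
    x1, x2 = x0
    history = []
    for alpha in alphas:
        x1, x2 = solve_ump(x1, x2, alpha)
        p = (max(x1 + x2 - 5.0, 0.0) ** 2
             + max(-x1, 0.0) ** 2 + max(-x2, 0.0) ** 2)
        history.append((alpha, x1, x2, alpha * p))
        if alpha * p < eps:          # stopping rule of Step 5
            break
    return history


history = penalty_method((0.0, 0.0), [10.0 ** k for k in range(8)])
for row in history:
    print(row)
```

The iterates stay slightly infeasible ($x_1 + x_2 > 5$) and move towards the point $(7/3, 8/3)$ obtained analytically in Example 10.6.4.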


Remark 10.7.1 Although we have assumed from the very beginning that the given nonlinear
programming problem is a convex programming problem, the basic logic of the penalty
function method is valid even for nonconvex problems provided we have methods to
solve those UMP's which are not necessarily convex. Since most of the standard methods
for UMP's do require convexity to give a global optimal solution, it becomes imperative to
assume that the nonlinear programming problem is a convex programming problem. But
theoretically at least, the penalty function method is valid for nonconvex NLP's as well.


Example 10.7.1 Use the numerical implementation of the penalty function method to
solve

    Min $(x_1 - 2)^4 + (x_1 - 2x_2)^2$
    subject to
    $x_1^2 - x_2 = 0$,

starting with the point $x^{(0)} = (2, 1)^T$.

Solution The constraint $x_1^2 - x_2 = 0$ can be written as $x_1^2 - x_2 \le 0$ and $-x_1^2 + x_2 \le 0$.
Therefore the penalty function is

    $P(x_1, x_2) = (\mathrm{Max}(x_1^2 - x_2, 0))^2 + (\mathrm{Max}(-(x_1^2 - x_2), 0))^2 = (x_1^2 - x_2)^2$.

(In general for an equality constraint $g(x) = 0$, the penalty function is $P(x) = (g(x))^2$.
This follows by writing $g(x) = 0$ as $g(x) \le 0$ and $-g(x) \le 0$.)

We take $\alpha_1 = 0.1$, $\alpha_2 = 1$, $\alpha_3 = 10$, $\alpha_4 = 100$ etc. Then following the steps
of the algorithm we get the following table.

 k | $\alpha_k$ | $x^{(k)}$        | $f(x^{(k)})$ | $q(x^{(k)}, \alpha_k)$ | $P(x^{(k)})$ | $\alpha_k P(x^{(k)})$
 1 | 0.1  | (1.4539, 0.7608) | 0.0935 | 0.2766 | 1.8307    | 0.1831
 2 | 1    | (1.1687, 0.7407) | 0.5753 | 0.9661 | 0.3908    | 0.3908
 3 | 10   | (0.9906, 0.8425) | 1.5203 | 1.7129 | 0.01926   | 0.1926
 4 | 100  | (0.9507, 0.8875) | 1.8917 | 1.9184 | 0.000267  | 0.0267
 5 | 1000 | (0.9461, 0.8934) | 1.9405 | 1.9433 | 0.0000028 | 0.0028

As per our stopping criterion we stop here because $f(x^{(5)}) \approx q(x^{(5)}, \alpha_5)$ or equivalently
$\alpha_5 P(x^{(5)})$ is close to zero. Therefore an optimal solution of the given (NLP) can be taken
as $(\bar{x}_1 = 0.9461, \bar{x}_2 = 0.8934)$ with the optimal value as 1.9405.
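The columns of the table are tied together by $q = f + \alpha P$, so the reported iterates can be checked by direct substitution. The Python sketch below recomputes $f$, $P$ and $q$ from the rounded points of the table; it does not rerun the minimization. The values $\alpha_1 = 0.1$ and $\alpha_2 = 1$, the first coordinate of $x^{(1)}$ and the entry 1.5203 are read from a partly illegible table and are therefore assumptions, chosen so that the rows are mutually consistent.

```python
# (alpha_k, x^(k), reported f(x^(k))) as read from the table of Example 10.7.1
rows = [
    (0.1, (1.4539, 0.7608), 0.0935),
    (1.0, (1.1687, 0.7407), 0.5753),
    (10.0, (0.9906, 0.8425), 1.5203),
    (100.0, (0.9507, 0.8875), 1.8917),
    (1000.0, (0.9461, 0.8934), 1.9405),
]


def f(x1, x2):
    """Objective (x1 - 2)^4 + (x1 - 2 x2)^2."""
    return (x1 - 2.0) ** 4 + (x1 - 2.0 * x2) ** 2


def P(x1, x2):
    """Penalty (x1^2 - x2)^2 for the equality constraint x1^2 - x2 = 0."""
    return (x1 * x1 - x2) ** 2


recomputed = []
for alpha, (x1, x2), f_reported in rows:
    fx, px = f(x1, x2), P(x1, x2)
    recomputed.append((alpha, fx, fx + alpha * px, px, alpha * px))

for entry in recomputed:
    print(entry)
```

Small differences in the last digits are expected, because the points in the table are rounded to four decimals.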


10.8 Mathematical Justification of the Penalty Function Method 


In this section we establish the convergence of the penalty function method. For this 
we first prove the following two lemmas. 


Lemma 10.8.1. Let $x^{(k)}$ denote the optimal solution of (UMP)$_{\alpha_k}$, i.e. $q(x^{(k)}, \alpha_k) =
\mathrm{Min}_x\, q(x, \alpha_k)$, where $q(x, \alpha_k) = f(x) + \alpha_k P(x)$, $\alpha_k > 0$. Then

(i) $q(x^{(k)}, \alpha_k) \le q(x^{(k+1)}, \alpha_{k+1})$

(ii) $P(x^{(k)}) \ge P(x^{(k+1)})$

(iii) $f(x^{(k)}) \le f(x^{(k+1)})$.

Proof.
(i) We have

    $q(x^{(k+1)}, \alpha_{k+1}) = f(x^{(k+1)}) + \alpha_{k+1} P(x^{(k+1)})$
    $\ge f(x^{(k+1)}) + \alpha_k P(x^{(k+1)})$
    $\ge q(x^{(k)}, \alpha_k)$,

where the first inequality follows because $\alpha_{k+1} > \alpha_k$, while the second inequality
follows because $x^{(k)}$ is optimal to (UMP)$_{\alpha_k}$.

(ii) By the definition of $x^{(k)}$ and $x^{(k+1)}$,

    $f(x^{(k)}) + \alpha_k P(x^{(k)}) \le f(x^{(k+1)}) + \alpha_k P(x^{(k+1)})$    (10.26)

    $f(x^{(k+1)}) + \alpha_{k+1} P(x^{(k+1)}) \le f(x^{(k)}) + \alpha_{k+1} P(x^{(k)})$.    (10.27)

Now adding (10.26) and (10.27) we get

    $(\alpha_{k+1} - \alpha_k) P(x^{(k+1)}) \le (\alpha_{k+1} - \alpha_k) P(x^{(k)})$,

i.e.

    $P(x^{(k+1)}) \le P(x^{(k)})$.

(iii) Again by the definition of $x^{(k)}$,

    $f(x^{(k+1)}) + \alpha_k P(x^{(k+1)}) \ge f(x^{(k)}) + \alpha_k P(x^{(k)})$.

But $P(x^{(k)}) \ge P(x^{(k+1)})$, and therefore from the above inequality

    $f(x^{(k+1)}) \ge f(x^{(k)})$.
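Lemma 10.8.2. Let $\bar{x}$ be an optimal solution of the given nonlinear programming problem
(10.25). Then for each $k$,

    $f(\bar{x}) \ge q(x^{(k)}, \alpha_k) \ge f(x^{(k)})$.

Proof. Since $\bar{x}$ is optimal to problem (10.25), it is certainly feasible. Hence by the
definition of the penalty function $P(x)$, we have $P(\bar{x}) = 0$, which gives

    $f(\bar{x}) = f(\bar{x}) + \alpha_k P(\bar{x})$.

But $x^{(k)}$ is an optimal solution of (UMP)$_{\alpha_k}$ and therefore

    $f(\bar{x}) + \alpha_k P(\bar{x}) \ge f(x^{(k)}) + \alpha_k P(x^{(k)})$,

i.e.

    $f(\bar{x}) \ge f(x^{(k)}) + \alpha_k P(x^{(k)}) = q(x^{(k)}, \alpha_k) \ge f(x^{(k)})$,

where the last inequality holds because $\alpha_k P(x^{(k)}) \ge 0$.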
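The inequalities of Lemma 10.8.1 and Lemma 10.8.2 can be observed on the one-variable problem of Example 10.6.2, for which $x(\alpha) = \alpha/(1 + \alpha)$ and $\bar{x} = 1$. The following Python sketch is only a numerical illustration for a few values of $\alpha$, not a proof; the chosen values of $\alpha$ are arbitrary.

```python
def f(x):
    """Objective of Example 10.6.2."""
    return x * x


def P(x):
    """Penalty for the constraints 1 <= x <= 4."""
    return max(1.0 - x, 0.0) ** 2 + max(x - 4.0, 0.0) ** 2


alphas = [1.0, 10.0, 100.0, 1000.0]
xs = [a / (1.0 + a) for a in alphas]            # minimizers x(alpha)
fs = [f(x) for x in xs]
ps = [P(x) for x in xs]
qs = [fx + a * px for a, fx, px in zip(alphas, fs, ps)]
f_bar = f(1.0)                                  # optimal value of the NLP

for a, fx, px, qx in zip(alphas, fs, ps, qs):
    print(a, fx, px, qx)
```

Along the sequence, $q$ and $f$ increase, $P$ decreases, and both $q$ and $f$ stay below $f(\bar{x}) = 1$.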
    Min $\frac{x}{2}$
    subject to
    $x \ge 1.$

We can check graphically that the minimizing point is $\bar{x} = 1$.
Let $S = \{x \in R : (1 - x) \le 0\}$ denote the feasible region of the given problem. We now
define a function $B : \mathrm{int}\, S \to R$ given by

    $B(x) = \frac{-1}{(1 - x)}$,

where 'int $S$' denotes the interior of the set $S$, i.e. int $S = \{x \in R : (1 - x) < 0\}$. For $B(x)$
we note that (i) $B(x) \ge 0$ for all $x \in$ int $S$, (ii) $B(x)$ is differentiable in its domain and
(iii) $B(x) \to +\infty$ as $x$ approaches the boundary of $S$.

We next consider the following unconstrained minimization problem (UMP)$_r$ for $r > 0$

    $\mathrm{Min}_{x \in \mathrm{int}\, S}\; C(x, r) = \frac{x}{2} + r \left( \frac{-1}{1 - x} \right)$.    (10.29)

Looking at the above problem, the first reaction is about its notation; there is
apparently a constraint, namely '$x \in$ int $S$', and yet we are calling it an unconstrained
minimization problem. The reason for this stems from the property (iii) of $B(x)$. As
$B(x) \to +\infty$ whenever $x$ approaches the boundary of $S$, it is obvious that starting from a
point in the interior of the set $S$ and following a descent strategy, the optimal solution
of (UMP)$_r$ will automatically be in the interior of $S$, thereby making the constraint
'$x \in$ int $S$' irrelevant. If we now plot the function $C(x, r)$ for various values of $r$ we get
the curves as shown in Fig 10.5.

From Fig 10.5, we feel intuitively that if we obtain the optimal solution of (UMP)$_r$
(say $x(r)$) then $x(r)$ should tend to the minimizing point $\bar{x}$ as $r \to 0$. This is in fact true
as we can verify for the above example by employing usual calculus; because $\frac{dC}{dx} = 0$
gives $x = 1 - \sqrt{2r}$ and $x = 1 + \sqrt{2r}$, and $\frac{d^2C}{dx^2} > 0$ for $x(r) = 1 + \sqrt{2r}$. Thus the minimum of
problem (UMP)$_r$ is attained for $x(r) = 1 + \sqrt{2r}$, which tends to $\bar{x} = 1$ (the actual
minimizing point of the given optimization problem) as $r \to 0$.


Fig. 10.5.

10.10 Analytical Implementation of the Barrier Function Method

Consider the nonlinear programming problem

    Min $f(x)$
    subject to
    $g_i(x) \le 0 \quad (i = 1, 2, \ldots, m)$,    (10.30)

with its feasible region $S = \{x \in R^n : g_i(x) \le 0,\ i = 1, 2, \ldots, m\}$. We assume that
int $S \ne \emptyset$. A consequence of this assumption is that problem (10.30) cannot have any
equality constraint.

As mentioned in the last section, the barrier function method works by establishing
a barrier on the boundary of the feasible region that prevents a search procedure from
leaving the feasible region. We now give a formal definition of a barrier function.

Definition 10.10.1. (Barrier Function) A function $B : \mathrm{int}\, S \to R$ is called a barrier
function if

(i) $B(x) \ge 0$ for all $x \in$ int $S$
(ii) $B(x)$ is differentiable
(iii) $B(x) \to +\infty$ as $x$ approaches the boundary of $S$.

Though there could be many choices of barrier function, a typical barrier function used
most often is

    $B(x) = \sum_{i=1}^{m} \frac{-1}{g_i(x)}, \quad x \in \mathrm{int}\, S$.









For $r > 0$ we then consider the problem (UMP)$_r$

    $\mathrm{Min}_{x \in \mathrm{int}\, S}\; C(x, r) = f(x) + r B(x)$.

Surprisingly, this problem can be solved by using unconstrained minimization techniques,
hence the notation (UMP)$_r$. This is because we start from an interior point and use the
steepest descent method or a similar technique to solve (UMP)$_r$. As $B(x) \to +\infty$ for points $x$ near
to the boundary of $S$, the solution $x(r)$ of (UMP)$_r$ is bound to be in int $S$ and so the
constraint $x \in$ int $S$ need not be considered explicitly.

Once $x(r)$ is known explicitly as a function of $r$ we obtain $\bar{x} = \lim_{r \to 0} x(r)$ as an optimal
solution of problem (10.30).


Example 10.10.1 Use analytical implementation of the barrier function method to
solve

    Min $x^2$
    subject to
    $-1 \le x \le 1.$

Solution Here $f(x) = x^2$ and $g_1(x) = (x - 1) \le 0$ and $g_2(x) = -(1 + x) \le 0$ are the two
constraints. Therefore we take

    $B(x) = \frac{-1}{(x - 1)} + \frac{-1}{-(1 + x)} = \frac{-1}{(x - 1)} + \frac{1}{(1 + x)}$,

and introduce the following problem (UMP)$_r$ for $r > 0$

    $\mathrm{Min}_{-1 < x < 1}\; C(x, r) = x^2 - \frac{r}{(x - 1)} + \frac{r}{(x + 1)}$.

At this stage, we should note that for this problem, $-1 < x < 1$ is the constraint $x \in$ int $S$,
which is never enforced because it follows automatically that the optimal solution of
(UMP)$_r$ will be in the interior of the set $S$. Therefore for all practical purposes, (UMP)$_r$
is an unconstrained optimization problem. Now $\frac{dC}{dx} = 0$ gives

    $2x \left[ 1 + \frac{2r}{(x - 1)^2 (x + 1)^2} \right] = 0$,

i.e. $x(r) = 0$. Also $\frac{d^2C}{dx^2} > 0$ for $x(r) = 0$. Therefore $x(r) = 0$ is the optimal solution
of (UMP)$_r$, and $\lim_{r \to 0} x(r)$, i.e. $\bar{x} = 0$, is the minimizing point of the given optimization
problem.
Step 1. Choose a suitable barrier function $B(x)$. In practice we take

    $B(x) = \sum_{i=1}^{m} \frac{-1}{g_i(x)}$.

Step 2. Choose a decreasing sequence of positive real numbers which tends to zero, i.e.
a sequence $\{r_k\}$ such that for all $k$, $r_k > 0$, $r_{k+1} < r_k$ and $\{r_k\} \to 0$ as $k \to +\infty$. In
practice we take $r_1 = 10$, $r_2 = 1$, $r_3 = 0.1$, $r_4 = 0.01$ etc.

Step 3. Choose a starting point $x^{(0)} \in$ int $S$ and construct the unconstrained minimization
problem (UMP)$_{r_1}$

    $\mathrm{Min}_{x \in R^n}\; C(x, r_1) = f(x) + r_1 B(x)$.

Next, solve (UMP)$_{r_1}$ by a suitable unconstrained minimization technique starting with
the point $x^{(0)}$. Let $x^{(1)}$ be an optimal solution of (UMP)$_{r_1}$. Set $k = 1$. (Here we may note
one important change in (UMP)$_{r_1}$. We write $x \in R^n$ rather than $x \in$ the interior of the
set $S$, i.e. we take the problem as an unconstrained minimization problem. As has been
explained earlier, this does not change the original problem as $x^{(1)}$ will automatically be
in int $S$ due to the presence of the barrier function $B(x)$.)

Step 4. Construct the following unconstrained minimization problem (UMP)$_{r_{k+1}}$

    $\mathrm{Min}_{x \in R^n}\; C(x, r_{k+1}) = f(x) + r_{k+1} B(x)$,

and solve (UMP)$_{r_{k+1}}$ by a suitable unconstrained minimization technique starting with
$x^{(k)}$, where $x^{(k)}$ is the solution of (UMP)$_{r_k}$ as obtained at the preceding step.

Step 5. Continue till $r_k B(x^{(k)})$ is close to zero or equivalently $f(x^{(k)})$ is close to $C(x^{(k)}, r_k)$,
i.e. $r_k B(x^{(k)}) < \epsilon$ or $(C(x^{(k)}, r_k) - f(x^{(k)})) < \epsilon$ for a tolerance $\epsilon > 0$.
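The steps above can be sketched on a one-variable problem. The Python sketch below assumes the problem Min $x/2$ subject to $x \ge 1$ with $B(x) = -1/(1 - x)$, for which calculus gives $x(r) = 1 + \sqrt{2r}$; the problem data, the tolerance $\epsilon = 10^{-2}$ and the function names are illustrative assumptions. The inner minimization uses a ternary search on an interval inside int $S$ (valid because $C(x, r)$ is convex for $x > 1$), so no warm start is needed in this simple setting.

```python
import math


def barrier(x):
    """B(x) = -1/(1 - x), positive on int S = (1, +inf)."""
    return -1.0 / (1.0 - x)


def C(x, r):
    """Barrier objective C(x, r) = x/2 + r * B(x)."""
    return 0.5 * x + r * barrier(x)


def argmin_convex(func, lo, hi, iterations=300):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) <= func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def barrier_method(r_values, eps=1e-2):
    """Solve (UMP)_r for decreasing r until r * B(x) < eps (Step 5)."""
    history = []
    for r in r_values:
        x = argmin_convex(lambda t: C(t, r), 1.0 + 1e-9, 100.0)
        history.append((r, x, r * barrier(x)))
        if r * barrier(x) < eps:
            break
    return history


history = barrier_method([10.0, 1.0, 0.1, 0.01, 0.001, 0.0001])
for r, x, rb in history:
    print(r, x, 1.0 + math.sqrt(2.0 * r), rb)
```

In contrast to the penalty function method, every iterate here is strictly feasible and approaches the boundary point $\bar{x} = 1$ from inside the feasible region.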
























Before we start solving this problem, we note that if the constraint is given in the form
$x_1^2 - x_2 = 0$, we cannot use the barrier function method, because one of the
assumptions here is that int $S \ne \emptyset$. Also we cannot take $x^{(0)} = (1, 1)^T$ even
though it is feasible, because the starting point $x^{(0)}$ has to be in the interior of the
set $S$.

We now choose the barrier function $B(x)$ as

    $B(x) = \frac{-1}{(x_1^2 - x_2)}$,

and take $r_1 = 10$, $r_2 = 1$, $r_3 = 0.1$, $r_4 = 0.01$ etc. Next we consider problem (UMP)$_{r_1}$ as

    $\mathrm{Min}_{x \in R^2}\; C(x, r_1) = (x_1 - 2)^4 + (x_1 - 2x_2)^2 + 10 \left( \frac{-1}{x_1^2 - x_2} \right)$

and solve the same by a suitable unconstrained minimization technique starting with the
point $x^{(0)} = (0, 1)^T$. This gives an optimal solution of (UMP)$_{r_1}$ as $x^{(1)} = (0.7079, 1.5315)^T$.
We now construct problem (UMP)$_{r_2}$ and solve the same starting with $x^{(1)}$. We summarize
the results in the following table.

 k | $r_k$  | $x^{(k)}$
 1 | 10     | (0.7079, 1.5315)
 2 | 1      | (0.8282, 1.1098)
 3 | 0.1    | (0.8989, 0.9638)
 4 | 0.01   | (0.9294, 0.9162)
 5 | 0.001  | (0.9403, 0.9011)
 6 | 0.0001 | (0.9438, 0.8966)

For the first row, $C(x^{(1)}, r_1) = 18.0388$ and $r_1 B(x^{(1)}) = 9.7105$.

As per our stopping criterion we stop here because $f(x^{(6)}) \approx C(x^{(6)}, r_6)$ or equivalently
$r_6 B(x^{(6)})$ is close to zero. Therefore an optimal solution of the given problem
is $(\bar{x}_1 = 0.9438, \bar{x}_2 = 0.8966)$ with the optimal value as 1.9645.
x 









10.12 Mathematical Justification of the Barrier Function Method

In this section we state convergence results for the barrier function method. We do not
prove these results here as they follow similar to those presented in Section 10.8 for the
penalty function method.

Let $x^{(k)}$ denote the optimal solution of (UMP)$_{r_k}$, i.e. $C(x^{(k)}, r_k) = \mathrm{Min}_x\, C(x, r_k)$,
where $C(x, r_k) = f(x) + r_k B(x)$, $r_k > 0$. Then

Theorem 10.12.1 Let $\{x^{(k)}\}$ be a sequence generated by the barrier function method.
Then any limit point of the sequence is an optimal solution of the given nonlinear
programming problem.


10.13 Starting Point for the Barrier Function Method 


We know that to employ the barrier function method we have to start from an initial
point $x^{(0)} \in$ int $S$. If the problem is small then there is a possibility that such a point
can be obtained by some trial and error method. However if there are many constraints
(i.e. $m$ is large) and/or many variables (i.e. $n$ is large) then such an approach is neither
possible nor desirable. Therefore, in the following we present a simple but logical way
of finding a point $\hat{x}$ in the interior of the feasible region.

Let $S = \{x \in R^n : g_i(x) \le 0,\ i = 1, 2, \ldots, m\}$ be the feasible region. Then our aim is
to find $\hat{x} \in$ int $S$, where

    int $S = \{x \in R^n : g_i(x) < 0,\ i = 1, 2, \ldots, m\}$.

Let us follow the following steps.

Step 1. Choose $\hat{x} \in R^n$ arbitrary.

Step 2. Let $J(\hat{x}) = \{i \in I : g_i(\hat{x}) < 0\}$, where $I = \{1, 2, \ldots, m\}$. Thus $J(\hat{x})$ is the index set
of those constraints which are not active at $\hat{x}$, i.e. $J(\hat{x}) = I - I(\hat{x})$, $I(\hat{x})$ being the index set
of active constraints at $\hat{x}$.

Step 3. If $J(\hat{x}) = I$, stop as $\hat{x} \in$ int $S$, as $g_i(\hat{x}) < 0$ for all $i \in I$. However if $J(\hat{x}) \subset I$, go to
Step 4.

Step 4. Choose $j \notin J(\hat{x})$, i.e. choose $j$ for which $g_j(\hat{x}) \ge 0$, and construct the following
problem $(P_j)$

    Min $g_j(x)$
    subject to
    $g_i(x) \le 0, \quad i \in J(\hat{x})$.

For the above problem we have a point, namely $\hat{x}$, such that $\hat{x} \in$ int $S_j$, where
$S_j$ is the feasible region of problem $(P_j)$. Therefore problem $(P_j)$ can be solved by
the barrier function method by taking the starting solution as $\hat{x}$. Let $\bar{x}$ be the
optimal solution so obtained. If $g_j(\bar{x}) \ge 0$, stop as
int $S = \emptyset$; otherwise set $\hat{x} = \bar{x}$ and go to Step 2.
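The bookkeeping in Steps 2 to 4 is easy to sketch. In the Python fragment below (an illustration only, with hypothetical function names), the constraints are passed as a list of functions $g_1, \ldots, g_m$, indices are 1-based as in the text, and the subproblem $(P_j)$ is only identified, not solved. The test data are the two constraints $x - 1 \le 0$ and $-(1 + x) \le 0$ of Example 10.10.1.

```python
def strictly_satisfied(constraints, x_hat):
    """Return J(x_hat) = {i : g_i(x_hat) < 0} with 1-based indices."""
    return {i for i, g in enumerate(constraints, start=1) if g(x_hat) < 0.0}


def next_subproblem(constraints, x_hat):
    """Return None if x_hat is in int S, else (j, J) describing problem (P_j).

    j is the smallest index with g_j(x_hat) >= 0; (P_j) minimizes g_j
    subject to g_i(x) <= 0 for i in J.
    """
    J = strictly_satisfied(constraints, x_hat)
    all_indices = set(range(1, len(constraints) + 1))
    if J == all_indices:
        return None
    j = min(all_indices - J)
    return j, J


constraints = [lambda x: x - 1.0, lambda x: -(1.0 + x)]
print(next_subproblem(constraints, 0.0))   # x_hat = 0 is an interior point
print(next_subproblem(constraints, 2.0))   # g_1(2) = 1 >= 0, so (P_1) is formed
```

Choosing the smallest violated index is an arbitrary rule; the text allows any $j \notin J(\hat{x})$.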






In at most $m$ iterations we shall either be able to obtain a point $\hat{x}$
such that $\hat{x} \in$ int $S$, or conclude that int $S = \emptyset$.
10.14 Algorithms and Their Global Convergence

Let us recall that an algorithm is said to be globally convergent if, for an arbitrary starting
point, the algorithm is guaranteed to generate a sequence converging to a point in the
solution set. The meaning of 'arbitrary starting point' may change from one class of
problems to another class of problems. For one class an arbitrary starting point may
mean any $x^{(0)} \in R^n$, whereas for another class it may mean a feasible point $x^{(0)}$.
Establishing this property is not a simple task, as it requires results from analysis and
linear algebra. Also the proofs are algorithm dependent. This makes the analysis
difficult and therefore above the reach of most of the common readers. Zangwill
[173] in 1969 gave a unified theory of global convergence, which covers
the global convergence property of most of the algorithms. Here we
give a very brief account of this important development.

Let the given optimization problem be to minimize $f(x)$ over $x \in S$, where $S \subset R^n$
is the feasible region and $f$ is the objective function. By an algorithm for solving this
optimization problem we generally mean an iterative process that (i) starts from a
given point $x^{(0)}$, (ii) generates the current point $x^{(k)}$ according to some prescribed set of
instructions and (iii) stops as per a specified stopping rule.

Mathematically, we can view an algorithm as a point to set map $A$, which for a given
point $x^{(k)}$ generates a new point $x^{(k+1)} \in A(x^{(k)})$. This point to set map is called the
algorithmic map which generates the points $x^{(0)}, x^{(1)}, \ldots, x^{(k)}, x^{(k+1)}, \ldots$ where $x^{(k+1)} \in
A(x^{(k)})$ for each $k$. Further, obtaining $x^{(k+1)}$ from $x^{(k)}$ by using the given map $A$ is called
an iteration.

As an example let us consider the problem of minimizing $x^2$ subject to $x \ge 1$. Also let
$A_1(x) = \frac{1}{2}(x + 1)$ and $A_2(x) = [1, \frac{1}{2}(x + 1)]$ for $x \ge 1$ and $A_2(x) = [\frac{1}{2}(x + 1), 1]$ for $x < 1$.
Then with $x^{(0)} = 4$, the map $A_1$
generates the sequence $\{4, 2.5, 1.75, 1.375, \ldots\}$ which converges to the solution $\bar{x} = 1$.
In fact it converges to $\bar{x} = 1$ with any starting point $x^{(0)} \in R$.

For the map $A_2$, the image of any point is a closed interval, and any point of that
interval can be taken as the next iterate. Obviously many
sequences are possible for a given starting point.
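The two algorithmic maps of this example can be iterated directly. The Python sketch below generates the sequence produced by $A_1$ from $x^{(0)} = 4$ and one admissible sequence for $A_2$; picking the midpoint of the interval $A_2(x)$ is an arbitrary selection rule assumed here for illustration, since any point of the interval would be a valid choice.

```python
def a1(x):
    """Point to point map A1(x) = (x + 1) / 2."""
    return 0.5 * (x + 1.0)


def a2_interval(x):
    """Point to set map A2: returns the closed interval A2(x) as (lo, hi)."""
    mid = 0.5 * (x + 1.0)
    return (1.0, mid) if x >= 1.0 else (mid, 1.0)


def a2_midpoint(x):
    """One admissible selection from A2(x): the midpoint of the interval."""
    lo, hi = a2_interval(x)
    return 0.5 * (lo + hi)


def iterate(x0, step, n):
    """Return [x0, x1, ..., xn] with x_{k+1} = step(x_k)."""
    seq = [x0]
    for _ in range(n):
        seq.append(step(seq[-1]))
    return seq


seq1 = iterate(4.0, a1, 3)
seq2 = iterate(4.0, a2_midpoint, 40)
print(seq1)
print(seq2[:5], seq2[-1])
```

Both sequences approach the solution $\bar{x} = 1$; the second one is only one of the infinitely many sequences that $A_2$ can generate from the same starting point.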
Let $X$ and $Y$ be two nonempty closed sets, let $A : X \to Y$ be a point to set map and
let $x \in X$. Then the map $A$ is said to be closed at $x$ if

    $x^{(k)} \in X,\ x^{(k)} \to x,\ y^{(k)} \in A(x^{(k)}),\ y^{(k)} \to y \ \Rightarrow\ y \in A(x)$.

If for $Z \subset X$, the map $A$ is closed at each point in $Z$ then we say that $A$ is closed on $Z$.

For example, the map

    $A_3(x) = [\frac{3}{2} + \frac{1}{4}x,\ 1 + \frac{1}{2}x]$ for $x \ge 2$, $\quad A_3(x) = \frac{1}{2}(x + 1)$ for $x < 2$

is not closed at $x = 2$, whereas $A_1$ and $A_2$ are closed everywhere.

Definition 10.14.2 (Descent Function). Let $\Omega \subset X$ be the solution set of the given
optimization problem. Then a function $\alpha : X \to R$ is called a descent function if $\alpha(y) <
\alpha(x)$ for $x \notin \Omega$ and $y \in A(x)$.

Theorem 10.14.1 (Zangwill's Global Convergence Theorem).
Let $X \subset R^n$, $(X \ne \emptyset)$, be closed and $\Omega \subset X$, $(\Omega \ne \emptyset)$, be the solution set. Let $A : X \to
X$ be the given algorithmic (point to set) map which is closed on the complement of $\Omega$.
Given $x^{(0)} \in X$, let the algorithm generate a sequence $\{x^{(k)}\}$, $x^{(k+1)} \in A(x^{(k)})$,
which is contained in a closed and bounded subset of $X$. Also let there exist a descent
function $\alpha$. Then either the algorithm stops in a finite number of steps at a point $\bar{x}$ in
$\Omega$ or it generates a sequence $\{x^{(k)}\}$ such that every accumulation point of this sequence
is a point in $\Omega$.


Remark 10.14.1 Most algorithms discussed in Chapter 9, and also those discussed in
this chapter, are of type $A = MD$, where the map $D$ finds the direction $d^{(k)}$ and the map
$M$ finds the optimal step size $\alpha_k$ for a given $d^{(k)}$.








Ia s 
ae i ae : 


ven Ag has only one ro, ak. the steepest descent direction for the cas? 
se tt red to be deleted, 

pact and Bazaraa and Shetty [11] 0" 

jence ‘heorem. These texts also p row 

r UMP?’s and NLP’s discussed her’ 
We now present another powerful optimization technique developed by R. E. Bellman in
1957. This technique is known as dynamic programming, and it is applicable to
those optimization problems which can be visualized as multistage decision problems.
We shall restrict our discussion to deterministic problems only. Their
basic characteristics are as follows.

(i) There are a finite number of stages (say $N$) in the problem.
(ii) Each stage has a number of states associated with it.
(iii) At each stage $i$, we have to take a decision $d(i)$ from among a number of
choices, which may be infinite. The decision $d(i)$ depends on the current state $S_i$ of
the system.
(iv) The effect of taking a decision $d(i)$ at stage $i$ is to change the state of the system so
that at stage $(i + 1)$, the system is at a new state corresponding to the $(i + 1)$th stage.
(v) At stage $i$ of the system, a return function $R(i)$ is prescribed which depends upon
the decision $d(i)$.
(vi) The sequence of decisions $d(i)$ taken at every stage, i.e. $\{d(1), \ldots, d(N)\}$, is called a
policy.
(vii) There is an overall return function $R$ which is a function of $(R(1), \ldots, R(N))$. The
objective is to find that policy which maximizes/minimizes the overall return function
$R$. Such a policy, if it exists, is called an optimal policy.

The technique of dynamic programming is best suited when $R = \sum_{i=1}^{N} R(i)$ or
$R = \prod_{i=1}^{N} R(i)$.

An optimization problem which can be visualized in the above framework can, in
principle, always be solved by dynamic programming. Also, theoretically at least, the
objective and constraint functions are not required to be convex or differentiable for the
applicability of dynamic programming.

An Illustrative Example 


am 


We now take an illustrative example to explain the
methodology of dynamic programming. The example is a special case
of the general class of problems commonly known as shortest path problems.

Let us consider the network consisting of nodes and arcs shown in the accompanying
figure, together with the distances between those nodes which are connected by an
arc. Our aim is to find the shortest distance between node 1 and node 10.
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If we attempt to find the shortest path by trial and error or b 
we immediately realize that none o 


problem, then we can think of 
between node 1 and node 10. 


Looking at the network, we observe that there are four stages at which the decision 
is to be taken. Initially we are occy 


the first stage. At this Stage, a deci 
or node 4. The corresponding rety 













To apply the dynamic programming methodology, we have to make use of
Bellman's optimality principle. This principle states that

An optimal policy has the property that whatever the initial state and the
initial decision are, the remaining decisions must constitute an optimal policy
with regard to the state resulting from the first decision.

There is nothing special about the initial state here; the principle holds for any state.
Also this principle probably asserts the most natural and
logical way of decision making.

If we apply Bellman's optimality principle to the given multistage decision
problem, we shall get a recurrence relation, also called Bellman's equation. Further, there
will be an initial (or terminal) condition from the problem itself. Mathematically, then, we
simply need to solve the resulting Bellman's equation under the given initial/terminal
condition.

Let us now go back to our illustrative example and recall that our aim is to find the
path of shortest distance between node 1 and node 10. To make use of the optimality
principle let us define

    $f_i$ = shortest distance from node $i$ to node 10 $\quad (1 \le i \le 10)$.

Obviously, $f_{10} = 0$ and our aim is to find the value of $f_1$.

Now let us suppose that the current state of the system is node $i$ (i.e. we are currently
at node $i$ corresponding to some intermediate stage). If we decide to go to any admissible
node $j$ (i.e. any node $j$ which is joined by an arc with node $i$) and cover a distance of
$d_{ij}$, then the optimality principle asserts that from node $j$ to node 10 we should go via
the path of shortest distance covering the distance of $f_j$ units. Thus the total distance
of this action is $(d_{ij} + f_j)$. Since the choice of $j$ is arbitrary, if we find $\mathrm{Min}_j\, (d_{ij} + f_j)$, it
has to be the shortest distance from node $i$ to node 10. Therefore, we get

    $f_i = \mathrm{Min}_{\{j \text{ admissible}\}}\, (d_{ij} + f_j)$.    (10.32)


Since $f_{10} = 0$, using the recurrence relation (10.32), we can obtain $f_9$ and then $f_8$, $f_7$, etc., getting $f_1$ finally. Thus we have

$f_9 = \min(d_{9,10} + f_{10}) = \min(4 + 0) = 4$

$f_8 = \min(d_{8,10} + f_{10}) = \min(3 + 0) = 3$

$f_7 = \min(d_{7,8} + f_8,\ d_{7,9} + f_9) = \min(3 + 3,\ 3 + 4) = 6$

$f_6 = \min(d_{6,8} + f_8,\ d_{6,9} + f_9) = \min(6 + 3,\ 3 + 4) = 7$

$f_5 = \min(d_{5,8} + f_8,\ d_{5,9} + f_9) = \min(1 + 3,\ 4 + 4) = 4$

$f_4 = \min_j (d_{4,j} + f_j) = 8$

$f_3 = \min(d_{3,5} + f_5,\ d_{3,6} + f_6,\ d_{3,7} + f_7) = \min(3 + 4,\ 2 + 7,\ 4 + 6) = 7$

$f_2 = \min(d_{2,5} + f_5,\ d_{2,6} + f_6,\ d_{2,7} + f_7) = \min(7 + 4,\ 4 + 7,\ 6 + 6) = 11$

$f_1 = \min(d_{1,2} + f_2,\ d_{1,3} + f_3,\ d_{1,4} + f_4) = \min(2 + 11,\ 4 + 7,\ 3 + 8) = 11.$

Therefore the shortest distance between node 1 and node 10 is 11 units of distance.
To obtain the actual path, we note that the minimum value 11 in $f_1$ is attained for node 3 and node 4. Therefore, from node 1 we can go to node 3 or node 4. To be specific, suppose we go to node 3. But then the optimality principle tells that from node 3 to node 10 we should go via the path of shortest distance, i.e. we should look for the value of $f_3$. There the minimum is attained for node 5, and so from node 3 we should go to node 5. Continuing in this manner, we get an optimal path as

1 → 3 → 5 → 8 → 10.
We can also check that the following paths are also optimal:

1 → 4 → 5 → 8 → 10,

and

1 → 4 → 6 → 9 → 10.

These paths are obtained if we take the other alternative optimal solution at node 1, and decide to go to node 4.
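
The backward recursion used in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the arc lengths are read off the calculations above, except those leaving node 4, which are not quoted in the text and are inferred here (4→5 = 4 and 4→6 = 1) from the two optimal paths through node 4 and the value $f_4 = 8$. The function names are hypothetical.

```python
# Backward recursion f_i = min_j (d_ij + f_j) for the 10-node example.
# ASSUMPTION: the arcs out of node 4 are inferred from the optimal paths,
# not quoted from the text; any other arc out of node 4 is left out.
ARCS = {
    1: {2: 2, 3: 4, 4: 3},
    2: {5: 7, 6: 4, 7: 6},
    3: {5: 3, 6: 2, 7: 4},
    4: {5: 4, 6: 1},
    5: {8: 1, 9: 4},
    6: {8: 6, 9: 3},
    7: {8: 3, 9: 3},
    8: {10: 3},
    9: {10: 4},
}


def backward_recursion(arcs, terminal):
    """Return f (shortest distance to terminal) and the minimising successors."""
    f = {terminal: 0}
    best = {}
    # Every arc goes from a lower to a higher node number, so processing
    # nodes in decreasing order guarantees f[j] is known before f[i].
    for i in sorted(arcs, reverse=True):
        f[i] = min(d + f[j] for j, d in arcs[i].items())
        best[i] = [j for j, d in arcs[i].items() if d + f[j] == f[i]]
    return f, best


def one_optimal_path(best, start, terminal):
    """Follow the first minimising successor from start to terminal."""
    path = [start]
    while path[-1] != terminal:
        path.append(best[path[-1]][0])
    return path


f, best = backward_recursion(ARCS, 10)
print(f[1], best[1], one_optimal_path(best, 1, 10))
```

The list `best[1]` records every successor attaining the minimum at node 1, which is how the alternative optimal paths show up in the computation.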


Remark 10.15.1 We note that the optimal decision at every stage need not give an optimal path. For instance, in the above example the path obtained by taking the optimal decision at every stage has a total of 12 units of distance and therefore cannot be optimal.








Remark 10.15.2 If we define $f_i$ as the shortest distance from node 1 to node $i$, we get the initial condition as $f_1 = 0$. We then aim to obtain $f_{10}$ by evaluating $f_2$, $f_3$, etc. recursively. Thus there are two possible approaches of implementing the dynamic programming methodology, namely, the forward recursion and the backward recursion. The backward recursion is more popular because of its convenience and ease of implementation.


nake a decision a input 
: a ~<CIslon as a stage, and our inp 


ome | Ontam ed 
- wie%mM 19/2 
“=I, We nea 
TAAL Y {oS ga ic € 
saci Dani 
Here

$n$ = stage number

$S_n$ = input state

$d_n$ = decision

$\tilde{S}_n$ = output state

$R_n$ = return function = $R_n(S_n, d_n)$.

For an $N$-stage decision problem, the stages are arranged in series, with $\tilde{S}_{i+1} = S_i$ $(i = 1, \dots, (N-1))$; for example, $\tilde{S}_2 = S_1$.
In dynamic programming, we solve multivariable optimization problems sequentially, one stage at a time. Let $f_n(S_n, d_n)$ denote the total return calculated over $n$ stages, given a particular state $S_n$. Further, let $f_n^*(S_n)$ denote the optimal $n$-stage total return for a particular input state $S_n$. As a particular value of $S_n$ might give rise to many possible decisions $d_n$, the optimal $n$-stage total return $f_n^*(S_n)$ is attained for a decision $d_n^*$ amongst all possible decisions $d_n$. Therefore for the minimization problem with $R = \sum_{i=1}^{N} R_i$, we have

$$f_N^*(S_N) = \min_{d_N, \dots, d_1} \big(R_N(S_N, d_N) + \dots + R_1(S_1, d_1)\big).$$

Now the optimal one stage return is found by searching over all possible decision variables corresponding to a particular state variable. Hence,

$$f_1^*(S_1) = \min_{d_1} \big(R_1(S_1, d_1)\big).$$

Here we note that the range of $d_1$ is determined by $S_1$. But $S_1$ is determined by what has happened in the previous stage, which in turn gives

$$f_2^*(S_2) = \min_{d_2} \big(R_2(S_2, d_2) + f_1^*(S_1)\big).$$

Therefore for a general $N$-stage problem, with $R = \sum_{i=1}^{N} R_i$, we have

$$f_N^*(S_N) = \min_{d_N} \big(R_N(S_N, d_N) + f_{N-1}^*(S_{N-1})\big).$$

From the above Bellman's equation, we notice that we can obtain $f_N^*(S_N)$ for the given input state $S_N$, provided we knew $f_{N-1}^*(S_{N-1})$. Continuing in this manner we reach $f_1^*(S_1)$, which can be evaluated easily.






Consider the cargo loading problem introduced in Chapter 1. Let there be $N$ items. Also, let the weight of one unit of the $i$th item be $w_i$ and the value of one unit of the $i$th item be $v_i$ $(i = 1, \dots, N)$. Further, let the maximum capacity of the cargo be $W$ tons. The problem is to determine the number of units (integral) of each item to be loaded so that the total value is maximum. This gives the optimization problem

Max $\sum_{i=1}^{N} v_i x_i$

subject to

$\sum_{i=1}^{N} w_i x_i \le W$

$x_i \ge 0$ and integer $(i = 1, \dots, N)$.


Let $f_N(W)$ denote the optimal value of the above problem. Then, in view of our discussion as outlined earlier, we have

$$f_1(W) = \max_{x_1 = 0, 1, \dots, \lfloor W/w_1 \rfloor} (v_1 x_1)$$

and

$$f_N(W) = \max_{x_N = 0, 1, \dots, \lfloor W/w_N \rfloor} \big(v_N x_N + f_{N-1}(W - w_N x_N)\big). \qquad (10.34)$$
We now give actual calculations for $W = 8$, $N = 3$, $v_1 = 4$, $v_2 = 10$, $v_3 = 6$, $w_1 = 5$, $w_2 = 8$ and $w_3 = 3$. In terms of our notation of $f_N(W)$, we need to compute $f_3(8)$. But from (10.34), we observe that the evaluation of $f_3(8)$ will need the knowledge of $f_2(W - w_3x_3)$, which is not known explicitly until $x_3$ is known explicitly. Therefore it makes sense to evaluate $f_2$ at various grid points, say, $W = 0, 1, \dots, 8$. Similar calculations will also be required for $f_1$. Therefore we need to have the below given table

W          0   1   2   3   4   5   6   7   8
f_1[W]     0   0   0   0   0   4   4   4   4
x_1[W]     0   0   0   0   0   1   1   1   1
f_2[W]     0   0   0   0   0   4   4   4   10
x_2[W]     0   0   0   0   0   0   0   0   1

Here $x_1[W]$ is the value of $x_1$ for which the value $f_1[W]$ is attained, with

$$f_1(W) = \max_{x_1 = 0, 1, \dots, \lfloor W/w_1 \rfloor} (v_1 x_1).$$

As an illustration of the second stage, we have

$$f_2(W) = \max_{x_2 = 0, 1, \dots, \lfloor W/w_2 \rfloor} \big(v_2 x_2 + f_1(W - w_2 x_2)\big),$$

and $w_2 = 8$; for $W = 6$, $x_2$ can take only one value, namely $x_2 = 0$. Therefore $f_2(6) = (10 \times 0) + f_1(6 - (8 \times 0)) = f_1(6) = 4$, which is attained for $x_2 = 0$. In a similar manner, for the third stage we use

$$f_3(W) = \max_{x_3 = 0, 1, \dots, \lfloor W/w_3 \rfloor} \big(v_3 x_3 + f_2(W - w_3 x_3)\big).$$


Since $w_3 = 3$, for $W = 8$, the possible values of $x_3$ are 0, 1, and 2. Therefore

$f_3(8) = \max\big((6 \times 0) + f_2(8),\ (6 \times 1) + f_2(5),\ (6 \times 2) + f_2(2)\big) = \max(10,\ 6 + 4,\ 12 + 0) = 12.$

Therefore the optimal value of the cargo is $f_3(8)$, which is equal to 12. Also the optimal value is attained for $x_3 = 2$. Now to determine $x_2$, we note that after loading 2 units of item 3, we have already consumed 6 units of weight. So, now the remaining capacity, namely $8 - 6 = 2$ tons, must be loaded by the remaining two items in an optimal manner. Hence we should look at the value of $f_2(2)$, which is equal to zero and is attained for $x_2 = 0$. Next we need to check $f_1(2)$, which gives $x_1 = 0$. Therefore the optimal solution is $(x_1 = 0, x_2 = 0, x_3 = 2)$ and the optimal value is 12.
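
The tabular evaluation of (10.34) followed by the trace-back can be sketched as below. This is a minimal Python illustration, assuming integer weights and capacity; the function name `cargo_loading` and the table layout are hypothetical choices made for the sketch, not part of the text.

```python
def cargo_loading(values, weights, capacity):
    """Evaluate f_k(W) = max_x (v_k x + f_{k-1}(W - w_k x)) on the grid W = 0..capacity.

    Assumes positive integer weights and a non-negative integer capacity.
    Returns the optimal value and one optimal loading plan (x_1, ..., x_N).
    """
    n = len(values)
    # f[k][W]: best value using items 1..k with capacity W (f[0][W] = 0).
    # x[k][W]: number of units of item k attaining f[k][W].
    f = [[0] * (capacity + 1) for _ in range(n + 1)]
    x = [[0] * (capacity + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        v, w = values[k - 1], weights[k - 1]
        for W in range(capacity + 1):
            for count in range(W // w + 1):
                candidate = v * count + f[k - 1][W - w * count]
                if candidate > f[k][W]:
                    f[k][W], x[k][W] = candidate, count
    # Trace back: fix x_N first, then spend the remaining capacity optimally.
    plan, remaining = [0] * n, capacity
    for k in range(n, 0, -1):
        plan[k - 1] = x[k][remaining]
        remaining -= weights[k - 1] * plan[k - 1]
    return f[n][capacity], plan


best_value, plan = cargo_loading([4, 10, 6], [5, 8, 3], 8)
print(best_value, plan)
```

With the data of the example the call reproduces the value 12 and the plan $(0, 0, 2)$ obtained above.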


The ‘Curse of Dimensionality’ in Dynamic Programming 


In the cargo loading problem, the weight constraint $\sum_{i=1}^{N} w_i x_i \le W$ gives rise to one state variable, namely the weight capacity of the cargo. Suppose we also have a volume constraint of the form $\sum_{i=1}^{N} q_i x_i \le Q$, where $q_i$ is the volume per unit of the $i$th item and $Q$ is the total volume capacity of the cargo. Then along with $W$, we have another state variable $Q$, and we need to work with $f_n(W, Q)$ similar to $f_n(W)$. The solution of the problem then requires a search (for the continuous case) or a grid search over the values of $(W, Q)$. Thus each constraint linking the variables gives rise to a state variable. So, here the number of constraints determines the dimension of the problem.

Therefore if the number of constraints is large, we have a large number of state variables (i.e. the problem has large dimension), each of which may take several possible values, and therefore a large number of function evaluations and related optimization. Since, in order to compute an optimal solution of the given problem, we need to retain an optimal solution for every value of every state variable at each stage, for large dimensional problems the amount of information required (both to be computed and stored) would be astronomical. This dramatic increase in the amount of computation and storage required to solve a given optimization problem is termed the curse of dimensionality by Bellman [14]. Bellman also gave a procedure that can be used to reduce the curse of dimensionality by the introduction of Lagrange multipliers into the problem. Some more recent developments in 'approximate dynamic programming' are presented in Bertsekas [19].



10.16 Summary and Additional Notes 


• Though there are many algorithms for solving NLP's, we have discussed only some chosen few in our presentation here. For linearly constrained NLP's, the algorithms discussed are Frank and Wolfe's method and Rosen's gradient projection method, which are presented in Sections 10.2 and 10.4 respectively.
• For solving general NLP's, which may have nonlinear constraints, we have concentrated only on the penalty function method and the barrier function method. These methods solve a given NLP by solving a sequence of unconstrained optimization problems, and are therefore appropriately called sequential unconstrained minimization techniques (SUMT).
• Theoretically, SUMT is capable of solving even nonconvex NLP's, provided we have some methods to find the global minimum point of the resulting unconstrained minimization problems. Though, in general, it is not easy to obtain the unconstrained min point of a nonconvex function, we shall have opportunity to discuss some appropriate evolutionary methods for the same.
• Section 10.5 discusses the motivation for the penalty function method while Sections 10.6 and 10.7 respectively present the analytical and computational implementation of the same. The mathematical justification of the penalty function method is detailed in Section 10.8.
• Section 10.9 gives a motivation for the barrier function method while its analytical and computational implementations are presented in Sections 10.10 and 10.11 respectively.

• Fiacco and McCormick, in a series of papers in the mid and late sixties, introduced and studied the penalty function approach and the barrier function approach for solving NLP's. Their 1968 book on SUMT [57] is now a classic on this subject. They also introduced the mixed penalty-barrier method for NLP's in which both equality and inequality constraints are present.
• Fletcher [58] in 1970 developed an exact penalty function method for solving NLP's where a single unconstrained minimization yields a solution to the original problem.
• Some other methods for solving NLP's (which have not been discussed here) are Zoutendijk's feasible direction method, Wolfe's reduced gradient method and Zangwill's convex-simplex method. Some appropriate references in this regard are the texts by Avriel [6], Simmons [143], Bazaraa and Shetty, Zangwill and [89].

• A major drawback of the penalty function method is that with large values of the penalty parameter $\alpha$, the Hessian matrix of the penalized function $q(x, \alpha)$ becomes ill-conditioned (i.e. the ratio of the maximum eigenvalue to the minimum eigenvalue of the matrix is fairly large), leading to large computational errors. To reduce the problem of ill-conditioning in the penalty function method, M. R. Hestenes in 1969 and also M. J. D. Powell in 1969 proposed the augmented Lagrangian method to solve equality constrained NLP's. Later, in 1973, R. T. Rockafellar gave an extension of this method to the inequality constraints. For the following constrained optimization problem




Min $f(x)$
subject to
$g_i(x) \le 0$, $(i = 1, \dots, m)$,
$h_j(x) = 0$, $(j = 1, \dots, p)$,

the augmented Lagrangian function is defined as

$$L(x, \lambda, \mu) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) + \sum_{j=1}^{p} \mu_j h_j(x) + \alpha \Big( \sum_{i=1}^{m} [\max\{0, g_i(x)\}]^2 + \sum_{j=1}^{p} h_j^2(x) \Big).$$

A detailed discussion of the augmented Lagrangian method is given in Bertsekas.
• Another effective class of methods for solving nonlinearly constrained optimization problems is based on the sequential quadratic programming (SQP) approach. The SQP methods can be implemented both in line search and trust region frameworks, and are effective for solving NLP's having significant nonlinearities. A good discussion on SQP methods is given in Nocedal and Wright [119].
• The most basic books on dynamic programming are Bellman [14] and Bellman and Dreyfus [15]. The book by Bertsekas [19] presents applications of the dynamic programming technique to problems of optimal control.

10.17 Exercises

10.1 Starting with $(x_1 = 1, x_2 = 4)$, solve the following problem by Frank and Wolfe's method

subject to
$x_1 + 2x_2 \le 9$
$-x_1 + x_2 \le 3$
$x_1, x_2 \ge 0.$

10.2 Determine the projection of the gradient of the function $f(x) = 5x_1 - 3x_2 + 6x_3$ onto the $x_1x_2$-plane. Sketch both the gradient and its projection.
10.3 In solving the nonlinear programming problem 
Maz 20 s 1) = i + 16x3 
subject to 
$x_1 + x_2 \le 8$
$x_1 + 3x_3 \le 9$
$x_2 - 2x_3 \le 0$
$x_1, x_2, x_3 \ge 0,$
what direction of movement from each of the following points would Rosen's gradient projection method select?

(i) (3, 4, 2) (ii) (0, 0, 3)
Rosen’s gradient projectt 





ae on method, using origin 
MMA TG. a : ; 
1.4 Solve the following problem va 
ls the st 17 ting po į nt 
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Maz ane + X2 + EY 
subject to 
2x1 +x2 <8 
r A (ee 2x2 S 4 
—xı + 2x2 < 0 
X1, X2 = 0. 


10.5 Consider the constraints $Ax \le b$ and let $P = I - A_1^T (A_1 A_1^T)^{-1} A_1$, where $A_1$ represents the gradients of the binding constraints at a given feasible point $\hat{x}$. What are the implications and geometric interpretations of the following statements?
(i) $P(\nabla f(\hat{x})) = 0$
(ii) $P(\nabla f(\hat{x})) = \nabla f(\hat{x})$
(iii) $P(\nabla f(\hat{x})) \ne 0$.


10.6 Using the analytical implementations of the 
enalt 
following problem f penalty function method, solve the 


Mar 4x—- 2 
subject to 


Aat n 
a = Sa he 
. 9 
À i a = 
E € 
ba a es: 
a pa = 


Qi 
k 
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n Nonlinear Pro i 
0.9 Consider the problem diais o- 
1 
M 2 
i Xi = 2x1 — xy 
subject to 
: AT One ae le 
wy XS 2. 
X1,X2 = 0, 
and solve the same by (i) Frank and Wolfe's method, (ii) Rosen's gradient projection method, (iii) the penalty function method and (iv) the barrier function method.


10.10 Show that for the nonlinear programming problem 'min $f(x)$, subject to $g_i(x) \le 0$ $(i = 1, \dots, m)$', the function $B(x)$ given by

$$B(x) = -\sum_{i=1}^{m} \log_e(-g_i(x))$$

meets the requirements of a barrier function. The function $B(x)$ is called the logarithmic barrier function. (Here, obviously, the domain of $B$ is the interior of the feasible region of the given NLP.)
Use the logarithmic barrier function method to solve the problem


Min x 
subject to 
VESI 


10.11 A student has to take examinations in three courses X, Y, and Z. He has three days available for study. He feels that it will be best to devote a whole day to the study of the same course, so that he may study a course for one day, 2 days, 3 days or not at all. His estimates of the grades that he may get by the study are given. How should he plan to study so that he maximizes the sum of grades?


ror p(X), 6 (x1, X2/%3), P 
idroelectric power Ls thi o. 
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; ‘Dor a ae a 7, i . ver im ortant 

Computational Complexity and Karmarkar's Algorithm for Linear Programming

11.1 Introduction


We have already studied the simplex algorithm for solving linear programming problems. This algorithm has been applied to numerous real life problems of different sizes and many efficient codes are available for its easy implementation.

The basic aim of this chapter is to discuss certain complexity issues related to the simplex algorithm and thereby conclude that the worst case computational complexity of the simplex algorithm is exponential. In this sense the simplex algorithm is not a good algorithm, because theoretically an algorithm is considered to be a good algorithm if its worst case computational complexity is polynomial. Roughly speaking, if the worst case computational complexity of an algorithm is exponential, then the amount of computation will grow exponentially with respect to the problem size, thereby making some problems of very large size almost impossible to be solved. This observation raises the question: does there exist a polynomial time algorithm for solving LPP's, i.e. does there exist an algorithm for solving LPP's where the amount of computation grows like a polynomial with respect to the problem size? This question was answered in the affirmative by N. Karmarkar in 1984 when he proposed the projective scaling algorithm for solving LPP's.

In the literature there are many variants of the original Karmarkar's algorithm. In our presentation here, we shall mostly stick to the basic algorithm as proposed by Karmarkar. Also at places, we may not be very precise mathematically, but every effort will be made to understand and clarify various concepts involved in the development of Karmarkar's algorithm.

One important aspect of Karmarkar's algorithm is the fact that it is an interior point method (IPM). Such algorithms have become very important recently because of their applications in semidefinite programming and second order cone programming.


11.2 Simplex Algorithm: A

The most basic characteristic of a linear programming problem is the fact that its objective and constraint functions are linear functions of decision variables. This structure of linearity guarantees the following for an LPP:

(P1) The feasible region of a LPP is always a convex set.
(P2) If the LPP has an optimal solution then at least one corner point of the feasible region is optimal.
(P3) A local optimal solution of the given LPP is also a global optimal solution.



Because of the property (P2), it makes sense to concentrate only on the corner points of the feasible region, i.e. given the LPP, we determine all corner points of the feasible region and then choose the one at which the objective function is optimum.

Since having all corner points at one go may be difficult, Dantzig proposed the following iterative procedure called the simplex algorithm.

Step 1 Start from an initial corner point.

Step 2 Check if the given (current) point is optimal. If yes, stop; otherwise go to Step 3. (Here it may be noted that this step essentially involves comparing the current objective function value vis-a-vis values at those corner points which are joined by an edge with the current corner point.)

Step 3 From the current corner point, move to an adjacent corner point (that is, a corner point which is joined by an edge with the current corner point) so that the objective function value is improved, and then go to Step 2.

Thus the simplex algorithm moves from a given corner point of the feasible region to another corner point which is joined by an edge and at which the objective function value is improved. Since the number of corner points is finite, the simplex algorithm is a finite iterative procedure (except in certain rare cases) for finding an optimal solution to the given LPP. The natural question is how big this finite number can be.

The algebraic version of the algorithm is obtained by replacing the word 'corner point' in Steps 1-3 by the word 'basic feasible solution'. Since a basic feasible solution is an equivalent algebraic representation of a corner point, it can be computed (at least theoretically) no matter how many variables are present in the given LPP.

Let us consider the linear programming problem

Max $z = c^T x$
subject to
$Ax = b$
$x \ge 0,$ (11.1)

where $c \in R^n$, $b \in R^m$ and $A = (a_{ij}) \in R^{m \times n}$ is an $(m \times n)$ matrix, with $b \ge 0$ and Rank $A = m\ (< n)$. We further assume that the data, namely the elements of $c$, $b$, and $A$, are rational. In this case, we can multiply by a suitable integer and scale each variable appropriately so that all data is in integers only.

We wish to answer the question: how much computational effort is involved in solving any instance of such a LPP? Theoretically such questions about an algorithm are answered in terms of their complexity.



Definition 11.3.1 Let $f(n)$ and $g(n)$ be functions from positive integers to positive reals. Then $f$ is said to be of order $O(g(n))$, written as $f(n) = O(g(n))$, if there exists $\alpha > 0$ such that for large enough $n$, $f(n) \le \alpha g(n)$.

In complexity theory we are interested in the behavior of an algorithm when supplied with very large input, that is, we wish to know how the amount of computation grows with respect to input size. Thus our interest here is not in the amount of computation for a problem but rather in its growth as inputs of very large size are supplied. Using this notion, the complexity of an algorithm may be expressed in phrases like: the time complexity of the given algorithm is $O(n^2)$. To have some idea about the growth of certain functions, suppose that for $n = 10$, there is an algorithm of order $O(\log n)$ that requires a full hour, an $O(n^5)$ algorithm that takes a minute and an $O(2^n)$ algorithm that takes a second for solving any instance of the given problem. It can be seen that for $n = 100$, the $O(\log n)$ algorithm will require 2 hours, the $O(n^5)$ algorithm will require 69.4 days and the $O(2^n)$ algorithm will need about $4 \times 10^{17}$ centuries. As problem size becomes large, such differences clearly make an $O(\log n)$ algorithm much preferred to $O(n)$ and $O(n^2)$ algorithms, and these much preferred to an $O(2^n)$ algorithm.
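
The growth figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes, as a reading of the example, that at $n = 10$ the $O(\log n)$, $O(n^5)$ and $O(2^n)$ algorithms take an hour, a minute and a second respectively, and that running time scales exactly like the stated growth function; the helper name `scaled_time` is hypothetical.

```python
import math

HOUR, MINUTE, SECOND = 3600.0, 60.0, 1.0  # durations in seconds
DAY = 24 * HOUR
CENTURY = 100 * 365.25 * DAY


def scaled_time(base_seconds, growth, n_small=10, n_large=100):
    """Running time at n_large if time scales exactly like growth(n)."""
    return base_seconds * growth(n_large) / growth(n_small)


# ASSUMPTION: base times of an hour, a minute and a second at n = 10.
log_time = scaled_time(HOUR, math.log)
poly_time = scaled_time(MINUTE, lambda n: n ** 5)
expo_time = scaled_time(SECOND, lambda n: 2 ** n)

print(log_time / HOUR, "hours")
print(poly_time / DAY, "days")
print(expo_time / CENTURY, "centuries")
```

The exponential entry is $2^{90}$ seconds, which is what pushes it into the range of $10^{17}$ centuries.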









We express the complexity of an algorithm as a function of the size of the input. To supply an input to an algorithm, we have to encode it, that is, to represent it as a sequence of symbols over some alphabet. For example we can have binary encoding. Thus the input of the algorithm is represented as a string, and we define the size of the input as the length of this string.

In the case of the linear programming problem (11.1), the data is $m$, $n$, $A$, $c$ and $b$. To define the size of problem (11.1), we have to find the total length of the string when the integral data $m$, $n$, $A$, $c$ and $b$ is encoded in terms of binary numbers. For a real number $v$, let $\lceil v \rceil$ denote the smallest integer which is more than or equal to $v$; it is called the ceiling of the real number $v$. The number of bits in the binary encoding of an integer $p$ is $\lceil 1 + \log_2(1 + |p|) \rceil$. So when all the data of the LPP is integral, its size is defined as

$$\text{size} = \lceil 1 + \log_2(1 + m) \rceil + \lceil 1 + \log_2(1 + n) \rceil + \sum_{j} \big(1 + \lceil \log_2(1 + |c_j|) \rceil\big) + \sum_{i} \big(1 + \lceil \log_2(1 + |b_i|) \rceil\big) + \sum_{i} \sum_{j} \big(1 + \lceil \log_2(1 + |a_{ij}|) \rceil\big).$$
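
As a small illustration of the size formula, the sketch below computes it for integral data. The helper names `bits` and `lp_size` are hypothetical, and the identity $\lceil \log_2(1 + |p|) \rceil$ = `abs(p).bit_length()` is used so that the count stays in exact integer arithmetic. The sample data are the coefficients of Example 11.4.1, used here only as convenient numbers.

```python
def bits(p):
    """1 + ceil(log2(1 + |p|)): sign bit plus binary digits of |p|.

    For a non-negative integer q, ceil(log2(1 + q)) equals q.bit_length(),
    which avoids floating point logarithms.
    """
    return 1 + abs(p).bit_length()


def lp_size(A, b, c):
    """Size of an LPP with integral data, following the formula in the text."""
    m, n = len(A), len(c)
    return (bits(m) + bits(n)
            + sum(bits(v) for v in c)
            + sum(bits(v) for v in b)
            + sum(bits(v) for row in A for v in row))


# Illustrative data only: two constraints, two variables.
size = lp_size(A=[[1, 1], [2, 1]], b=[8, 10], c=[4, 3])
print(size)
```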


Since our interest is only in finding the upper bound on the computational effort involved, we need not compute the size of LPP (11.1) exactly. In fact some lower bound $L$ on the actual number of bits required to record the data is enough, provided we show that there is a polynomial growth in the overall computational effort which is an increasing function of the number $L$. Therefore for the LPP (11.1) we take

$$L = \Big\lceil 1 + \log_2 m + \log_2 n + \sum_{j} \big(1 + \log_2(1 + |c_j|)\big) + \sum_{i} \big(1 + \log_2(1 + |b_i|)\big) + \sum_{i} \sum_{j} \big(1 + \log_2(1 + |a_{ij}|)\big) \Big\rceil.$$



Given an LPP (11.1) with a known feasible basis, how many pivot steps are required before the simplex algorithm terminates? The answer depends on the particular instance, so a more meaningful question is: as a function of the size, what is a mathematical upper bound on the number of pivot steps for solving any LPP (11.1) of that size by the simplex algorithm, starting with a known feasible basis? This bound is known as the worst case computational complexity of the simplex algorithm. Thus the worst case computational complexity provides a guaranteed upper bound on the computational effort needed to solve any instance of LPP (11.1) by the simplex algorithm as a function of its size.

Definition 11.3.2 (Polynomial Time Algorithm). An algorithm for solving a problem is said to be a polynomial time algorithm if the worst case computational effort required by the algorithm is bounded above by a polynomial in the size of the problem.

In other words, if $L$ is the input size and the number of arithmetic operations is less than or equal to $\alpha f(L)$ for some $\alpha > 0$ and $f(L)$ a polynomial of fixed degree in $L$, then the algorithm is a polynomial time algorithm. Therefore for a polynomial time algorithm the amount of computation is never greater than some fixed power of $L$, i.e. $\alpha L^b$, $\alpha > 0$, $b > 0$, no matter which problem is solved.
Now in the context of the simplex algorithm, there are at most ${}^nC_m$ vertices that the algorithm could possibly visit. But

$${}^nC_m = \frac{n!}{m!\,(n - m)!} = \Big(\frac{n}{m}\Big)\Big(\frac{n - 1}{m - 1}\Big) \cdots \Big(\frac{n - (m - 1)}{m - (m - 1)}\Big) \ge \Big(\frac{n}{m}\Big)^m,$$

which is at least $2^m$ whenever $n \ge 2m$. Therefore it makes sense to think that on some problems, the amount of computation needed could be of exponential order. This is indeed true, as shown by Victor Klee and George Minty in 1971. We discuss these details in the next section.
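
The chain of inequalities above is easy to spot-check numerically. The following sketch uses only integer arithmetic (multiplying through by $m^m$), so no rounding is involved; the function name `vertex_bound_holds` and the tested ranges are hypothetical choices.

```python
from math import comb


def vertex_bound_holds(n, m):
    """Check C(n, m) >= (n/m)^m >= 2^m in exact integer arithmetic (needs n >= 2m)."""
    first = comb(n, m) * m ** m >= n ** m   # C(n, m) >= (n/m)^m
    second = n ** m >= (2 * m) ** m         # (n/m)^m >= 2^m
    return first and second


# Spot-check a range of sizes with n >= 2m.
all_hold = all(vertex_bound_holds(n, m)
               for m in range(1, 9)
               for n in range(2 * m, 2 * m + 11))
print(all_hold)
```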

11.4 Simplex Algorithm is not a Polynomial Time Algorithm

We shall here show that for every $n$ we can construct a LPP having $2^n$ corner points such that the simplex algorithm is forced to visit each of these extreme points. Thus we prove that for every $n$, there is an instance of an LPP for which the simplex algorithm is forced to take an exponential number of iterations, and thus the worst case computational complexity of the algorithm will turn out to be exponential.



Definition 11.4.1 (Isotonic Path). Let $S$ be the set of feasible solutions of the given LPP. A sequence of points $\{x^{(0)}, x^{(1)}, \dots, x^{(p)}\}$ is said to be an isotonic path if

(i) each point of the sequence is a corner point (basic feasible solution) of $S$;
(ii) consecutive points are adjacent corner points, i.e. the basis matrices of the corresponding basic feasible solutions differ in exactly one column;
(iii) as we proceed the objective function value improves, i.e. for the case of minimization, $z(x^{(0)}) > z(x^{(1)}) > \dots > z(x^{(p)})$.

Given an isotonic path $\{x^{(0)}, x^{(1)}, \dots, x^{(p)}\}$, the number of points in the sequence (excluding the initial point) is called the length of the isotonic path. Thus for the sequence $\{x^{(0)}, x^{(1)}, \dots, x^{(p)}\}$ the length of the isotonic path is $p$. Note that if an isotonic path is found in $S$, the simplex algorithm can be made to follow this path by making appropriate dropping and entering vector selection.


Example 11.4.1 For the linear programming problem

Max $z = 4x_1 + 3x_2$
subject to
$x_1 + x_2 \le 8$
$2x_1 + x_2 \le 10$
$x_1, x_2 \ge 0,$ (11.2)

identify all isotonic paths of length 2. Is there an isotonic path of length 3?

Solution This problem has already been solved by the graphical method in Chapter 2. The feasible region has four corner points, namely, O, P, Q and R (see Fig. 11.1) with objective function values $z = 0$, $z = 20$, $z = 26$, and $z = 24$ respectively. Therefore if we denote these corner points as $x^{(0)}$, $x^{(1)}$, $x^{(2)}$, and $x^{(3)}$ respectively, then $\{x^{(0)}, x^{(1)}, x^{(2)}\}$ and $\{x^{(0)}, x^{(3)}, x^{(2)}\}$ are two isotonic paths of length 2. Also there is no isotonic path of length 3. Note that $\{x^{(0)}, x^{(2)}\}$ is not an isotonic path, even though $z(x^{(2)}) = 26 > 0 = z(x^{(0)})$, because these are not adjacent corner points.

The Klee and Minty Cube 


We shall first give the construction for n = 2 and n = 3, which will then lead to the 
construction of the desired LPP for general n. 


Problem 1 (n = 2)

Min $-x_2$
subject to
$\varepsilon \le x_1 \le 1$
$\varepsilon x_1 \le x_2 \le 1 - \varepsilon x_1$
$x_1, x_2 \ge 0,$ (11.3)

where $0 < \varepsilon < \frac{1}{2}$.

The corner points (basic feasible solutions) for this problem are (see Fig. 11.2) $x^{(0)} = (\varepsilon, \varepsilon^2)$, $x^{(1)} = (1, \varepsilon)$, $x^{(2)} = (1, 1 - \varepsilon)$ and $x^{(3)} = (\varepsilon, 1 - \varepsilon^2)$. So there are four corner points and $\{x^{(0)}, x^{(1)}, x^{(2)}, x^{(3)}\}$ constitutes an isotonic path of length $(2^2 - 1) = 3$. Thus in the worst case the simplex algorithm may be forced to visit each of these corner points.



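Problem 2 (n = 3)

Min $-x_3$
subject to
$\varepsilon \le x_1 \le 1$
$\varepsilon x_1 \le x_2 \le 1 - \varepsilon x_1$
$\varepsilon x_2 \le x_3 \le 1 - \varepsilon x_2$
$x_1, x_2, x_3 \ge 0,$ (11.4)

where $0 < \varepsilon < \frac{1}{2}$.

The set of feasible solutions for this problem is a slightly perturbed cube in $R^3$. This problem has eight basic feasible solutions given by $x^{(0)} = (\varepsilon, \varepsilon^2, \varepsilon^3)$, $x^{(1)} = (1, \varepsilon, \varepsilon^2)$, $x^{(2)} = (1, 1 - \varepsilon, \varepsilon - \varepsilon^2)$, $x^{(3)} = (\varepsilon, 1 - \varepsilon^2, \varepsilon - \varepsilon^3)$, $x^{(4)} = (\varepsilon, 1 - \varepsilon^2, 1 - \varepsilon + \varepsilon^3)$, $x^{(5)} = (1, 1 - \varepsilon, 1 - \varepsilon + \varepsilon^2)$, $x^{(6)} = (1, \varepsilon, 1 - \varepsilon^2)$, $x^{(7)} = (\varepsilon, \varepsilon^2, 1 - \varepsilon^3)$, and $\{x^{(0)}, x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)}, x^{(5)}, x^{(6)}, x^{(7)}\}$ constitutes an isotonic path of length $(2^3 - 1) = 7$.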
Problem 3 (general n)

Min $-x_n$
subject to
$\varepsilon \le x_1 \le 1$
$\varepsilon x_1 \le x_2 \le 1 - \varepsilon x_1$
$\vdots$
$\varepsilon x_{n-1} \le x_n \le 1 - \varepsilon x_{n-1}$
$x_j \ge 0$ $(1 \le j \le n),$

where $0 < \varepsilon < \frac{1}{2}$.

The above problem can be rewritten as

Min $-x_n$
subject to
$x_1 - r_1 = \varepsilon$
$x_1 + s_1 = 1$
$x_j - \varepsilon x_{j-1} - r_j = 0$ $(2 \le j \le n)$
$x_j + \varepsilon x_{j-1} + s_j = 1$ $(2 \le j \le n)$
$x_j, r_j, s_j \ge 0$ $(1 \le j \le n).$

This problem has $2n$ equations and $3n$ variables. For a suitable choice of $\varepsilon$ (after scaling), the coefficients in the constraint equations are integers with bounded absolute values. Further, it can be shown that there are $2^n$ basic feasible solutions (corner points) and that there is an isotonic path of length $(2^n - 1)$. Therefore in the worst case the algorithm may be forced to visit each of these corner points and thus takes an exponential number of iterations. This example was constructed by Klee and Minty, and the feasible region is called the Klee and Minty cube.
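
The isotonic path through all corner points of the perturbed cube can be generated and checked mechanically. The sketch below is an illustration under stated assumptions: it orders the corner points by a reflected Gray code on the choice "lower bound or upper bound" for each variable, which reproduces the orderings listed for $n = 2$ and $n = 3$, and it uses exact rational arithmetic. The function name `klee_minty_path` is hypothetical.

```python
from fractions import Fraction


def klee_minty_path(n, eps):
    """Corner points of the perturbed cube, listed along a reflected Gray code.

    Bit j = 0 puts x_{j+1} at its lower bound, bit j = 1 at its upper bound:
    x_1 in {eps, 1}, and x_j in {eps * x_{j-1}, 1 - eps * x_{j-1}} for j >= 2.
    """
    codes = [[]]
    for _ in range(n):
        # Append the new bit as 0 in the same order, then as 1 in reverse order.
        codes = [c + [0] for c in codes] + [c + [1] for c in reversed(codes)]
    path = []
    for bits in codes:
        x = []
        for j, bit in enumerate(bits):
            lower = eps if j == 0 else eps * x[-1]
            upper = 1 if j == 0 else 1 - eps * x[-1]
            x.append(Fraction(upper if bit else lower))
        path.append(tuple(x))
    return path


path = klee_minty_path(3, Fraction(1, 4))
# Minimising -x_n means x_n must increase strictly along the path.
strictly_improving = all(p[-1] < q[-1] for p, q in zip(path, path[1:]))
print(len(path) - 1, strictly_improving)
```

Consecutive codes differ in exactly one bit, which corresponds to moving along one edge of the cube, so the list has the adjacency required of an isotonic path.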


Another problem, similar to the Klee and Minty cube, can be constructed which demonstrates that in the worst case the computational complexity of the simplex algorithm is exponential. Here the entry criterion of the simplex algorithm is taken as the most negative value of $(z_j - c_j)$ rather than the first negative value of $(z_j - c_j)$ encountered (as in the Klee and Minty cube). We describe the problem in the following.

A Variant of the Klee and Minty Cube

We consider the LPP

Max $\sum_{j=1}^{n} 10^{n-j} x_j$
subject to
$2 \sum_{j=1}^{i-1} 10^{i-j} x_j + x_i \le 10^{2(i-1)}$ $(1 \le i \le n)$
$x_j \ge 0$ $(1 \le j \le n)$. (11.8)


After adding $n$ slack variables in the constraints of (11.8) we note that there are $2n$ variables and $n$ constraints. We can further show that there are $2^n$ basic feasible solutions and if the negative-most entry criterion is used, then each of the $2^n$ basic feasible solutions will be examined for optimality.
For $n = 3$ the above problem (11.8) becomes

Max $z = 100x_1 + 10x_2 + x_3$
subject to
$x_1 \le 1$
$20x_1 + x_2 \le 100$
$200x_1 + 20x_2 + x_3 \le 10000$
$x_1, x_2, x_3 \ge 0$. (11.9)


If the above problem is solved by the simplex method with the negative-most entry criterion, then the feasible region and the corner points are as depicted in Fig. 11.3 and the following will be the initial simplex tableau.

Fig. 11.3. (0: initial solution, p: optimal solution)

              y^(1)   y^(2)   y^(3)   y^(s1)  y^(s2)  y^(s3)
s1 = 1           1       0       0       1       0       0
s2 = 100        20       1       0       0       1       0
s3 = 10000     200      20       1       0       0       1
z  = 0        -100     -10      -1       0       0       0
T 
The b.f.s. of the subsequent iterations of the simplex algorithm are summarized in the following table

Iteration no.   Basic feasible solution              Value of z
0               s1 = 1, s2 = 100, s3 = 10000         0
1               x1 = 1, s2 = 80,  s3 = 9800          100
2               x1 = 1, x2 = 80,  s3 = 8200          900
3               s1 = 1, x2 = 100, s3 = 8000          1000
4               s1 = 1, x2 = 100, x3 = 8000          9000
5               x1 = 1, x2 = 80,  x3 = 8200          9100
6               x1 = 1, s2 = 80,  x3 = 9800          9900
7               s1 = 1, s2 = 100, x3 = 10000         10000

Thus all $2^3 = 8$ basic feasible solutions are examined and $\{x^{(0)}, x^{(1)}, \ldots, x^{(7)}\}$ is an isotonic path of length $2^3 - 1 = 7$. This path starts from the corner point 0 depicted in Fig. 11.3.

11.5 Karmarkar's Algorithm for Linear Programming


In this section we present the basic ideas of the projective scaling algorithm due to Karmarkar [91], which is a polynomial time algorithm for solving linear programming problems. Here the emphasis is more on the geometrical understanding rather than the detailed mathematical derivation.

As mentioned by Karmarkar, his approach is based on two fundamental insights, namely:

(i) If the current solution is near the centre of the polytope, it makes sense to move in the direction of steepest descent (when the objective function is to be minimized).
(ii) The solution space can be transformed so as to place the current point near the centre of the polytope in the transformed space without changing the problem in any essential way.

We can see the first point in Fig. 11.4 below. As $x^{(0)}$ is near the centre of the polytope, one can improve the solution substantially by moving in the direction of steepest descent. But if $x^{(1)}$ is so moved, it will hit the boundary of the feasible region before much improvement occurs.


Fig. 11.4.
The second point is not so obvious; it is the novel and original idea which underlies Karmarkar's algorithm. In this context we observe that when we write down the data defining the given LPP, we in a sense overspecify the problem. The scaling of the data, for example, is quite arbitrary: one can switch from meter to centimeter without changing the problem in any important sense. Yet a transformation of the data that leaves the problem essentially unchanged may make it easier to solve. Rescaling, for example, may reduce numerical instability.

Karmarkar has observed that there is a transformation of the data more general than ordinary rescaling but equally natural. It occurs every time one views the feasible region of LPP at an oblique angle. The projection of the feasible region on one's retina is a distortion of the original problem; it is a special case of projective transformation. A projective transformation maps lines to lines but angles and distances change. As straight lines remain straight lines, while angles and distances change, under a projective transformation a polytope will remain a polytope but its orientation may change.

A key property of projective transformations is that a suitable one will move a point strictly inside a polytope to a place near the centre of the transformed polytope. One can verify this with Fig. 11.4 by viewing it at an angle and distance that makes $x^{(1)}$ appear to be at the centre of the polytope.


The Basic Strategy

The basic strategy of the projective scaling algorithm is as follows

(i) Take an interior point.
(ii) Transform the space so as to place the point near the centre of the polytope in the transformed space.
(iii) In the transformed space, move in the direction of the steepest descent, stopping short of the boundary so that the new point remains interior.


For performing the above steps, we need the LPP in a specified form, which we call Karmarkar's form, described below.

Min $c^T x$
subject to
$Ax = 0$
$e^T x = 1$
$x \ge 0$, (11.10)

where $A$ is an $m \times n$ matrix and $e = (1, 1, \ldots, 1)^T$. We make the following two assumptions on problem (11.10).

1. The point $e/n$ is feasible (that is, $A(e/n) = 0$; the other constraint $e^T x = 1$ is satisfied automatically for $x = e/n$).
2. The minimum value of Karmarkar's problem (11.10) is zero; that is, Min $c^T x = 0$.

There are various ways of putting a general LPP in the desired form so as to use Karmarkar's algorithm. The centering transformation does the trick for meeting Assumption 1. However, the transformation that gives Min $z = 0$ involves more tedious analysis, for example, using duality results of LPP and adjusting dual variables suitably. We shall explain this by an example in a later section. Before we discuss the centering transformation and other details of Karmarkar's algorithm, we shall like to visualize the feasible region of problem (11.10) geometrically and also the centre $e/n$ of the polytope in the original space. For this we consider the following example
Min 3x} =D AX 
subject to
$-x_1 + x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$,

and note that $e/3 = (1/3, 1/3, 1/3)^T$ is feasible for the given problem. Also the set $\{(x_1, x_2, x_3) : x_1 + x_2 + x_3 = 1, x_1, x_2, x_3 \ge 0\}$ is the outer face of the unit tetrahedron in $R^3$, or to be precise the simplex $S^2$. Therefore the feasible region of the given problem is the intersection of the plane $-x_1 + x_3 = 0$ with the outer face of the tetrahedron as shown in Fig. 11.5.

Geometrically the situation is very similar for problem (11.10). The set $\{x \in R^n : e^T x = 1, x \ge 0\}$ is the simplex $S^{n-1}$ and the constraint $Ax = 0$ represents a set of $m$ hyperplanes passing through the origin. Then the feasible region of problem (11.10) is the polytope which is obtained by taking the intersection of these hyperplanes with the simplex $S^{n-1}$.










Fig. 11.5.

11.6 Centering Transformation and Projection Matrix

In this section, we discuss the most novel and original idea of Karmarkar's algorithm, namely the centering transformation. The objective of the centering transformation is to map the current feasible point in the original space on to the center of the polytope in the transformed space. Under this projective transformation, a polytope gets transformed to another polytope.


Suppose that the current feasible point $x^{(k)}$ is strictly feasible, that is, all components of the vector $x^{(k)}$ are strictly positive. Karmarkar's centering transformation is a vector valued function $\theta_{x^{(k)}} : S^{n-1} \to S^{n-1}$ given by $\theta_{x^{(k)}}(x) = y$ where

(i) $S^{n-1} = \{x \in R^n : e^T x = 1, x \ge 0\} \subset R^n$ is an $(n-1)$-simplex.
(ii) $y = \frac{D_k^{-1} x}{e^T D_k^{-1} x} \in R^n$, where $D_k = diag(x_1^{(k)}, \ldots, x_n^{(k)})$, $D_k^{-1} = diag(1/x_1^{(k)}, \ldots, 1/x_n^{(k)})$, and $x^{(k)} = (x_1^{(k)}, \ldots, x_n^{(k)})^T$.

Here it must be noted that the mapping $\theta_{x^{(k)}}$ is defined with respect to the given strictly feasible point $x^{(k)}$. If the point $x^{(k)}$ is changed to a new strictly feasible point $x^{(k+1)}$ then $\theta$ will also change because the matrices $D_k$ and $D_k^{-1}$ will change to $D_{k+1}$ and $D_{k+1}^{-1}$. In fact, this is the reason that we have used the notation $\theta_{x^{(k)}} : S^{n-1} \to S^{n-1}$ given by $\theta_{x^{(k)}}(x) = y$ for all $x \in S^{n-1}$. However, if there is no confusion then we shall omit the subscript $x^{(k)}$ and write $\theta_{x^{(k)}}$ as $\theta$ only.
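The definition of the centering transformation translates directly into a few lines of code. The sketch below is illustrative only: the function names `center` and `uncenter` are assumptions, and the inverse formula $x = D_k y / (e^T D_k y)$ is the one used in the worked example of this chapter. Exact fractions are used so that the identities can be checked without rounding.

```python
from fractions import Fraction

def center(x, xk):
    """Centering map with respect to xk: y = D_k^{-1} x / (e^T D_k^{-1} x)."""
    scaled = [xi / xki for xi, xki in zip(x, xk)]
    total = sum(scaled)
    return [s / total for s in scaled]

def uncenter(y, xk):
    """Inverse map: x = D_k y / (e^T D_k y)."""
    scaled = [yi * xki for yi, xki in zip(y, xk)]
    total = sum(scaled)
    return [s / total for s in scaled]

# A strictly positive point of the simplex S^2, used here purely as an illustration.
xk = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
print(center(xk, xk))                      # the centre e/3 of the simplex

x = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
y = center(x, xk)
print(y, uncenter(y, xk) == x)             # the round trip recovers x

# A row a of A becomes the row a D_k in the transformed space.
a = [-1, 0, 1]
print([ai * xki for ai, xki in zip(a, xk)])
```

Note that the map sends $x^{(k)}$ itself to $e/n$ and leaves the vertices of the simplex fixed, which is easy to confirm by calling `center` on a unit vector.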
Properties of the Centering Transformation

The centering transformation has the following properties.

(i) The image of a strictly feasible point $x^{(k)}$ is the centre $e/n$ of the simplex $S^{n-1}$ in the transformed space, i.e. $\theta(x^{(k)}) = \frac{D_k^{-1} x^{(k)}}{e^T D_k^{-1} x^{(k)}} = \frac{e}{n}$.

(ii) The mapping $\theta$ carries the simplex $S^{n-1}$ onto the simplex $S^{n-1}$.

(iii) The constraint $Ax = 0$ is transformed to $(AD_k)y = 0$. Therefore, the image of a hyperplane through the origin is again a hyperplane passing through the origin but has different orientation.

We illustrate some of these points with the help of the following example
Mi 
‘ 3x1 = Xor-£ Xa 
subject to
$-x_1 + x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$.

Here Fig. 11.6 shows the feasible region in the original space and the current point $\bar{x} = (2/5, 2/5, 1/5)^T$ which is not the center $e/3$. Using the centering transformation $\theta_{\bar{x}}$ we get the new feasible region as shown in Fig. 11.6. Here $\theta_{\bar{x}}(\bar{x}) = e/3$ and the plane $-x_1 + x_3 = 0$ gets transformed to $-(2/5)y_1 + (1/5)y_3 = 0$ under the mapping $\theta$.
The Transformed Problem

Under the centering transformation, the original problem (11.10)

Min $c^T x$
subject to
$Ax = 0$
$e^T x = 1$
$x \ge 0$

gets transformed into the following linear fractional programming problem

Min $\frac{c^T D_k y}{e^T D_k y}$
subject to
$(AD_k)y = 0$
$e^T y = 1$
$y \ge 0$. (11.11)

The transformed problem (11.11) is not very satisfactory because, in the transformed space, problem (11.11) is no more a LPP. The objective function of (11.11), being the ratio of two linear functions, is not even convex (in fact it is pseudoconvex, a class of generalized convex functions to be studied later). But because of Assumption 2 on problem (11.10), the minimum value of problem (11.11) is also zero. As the denominator of the objective function is positive and bounded away from zero for all $y \in S^{n-1}$, problem (11.11) is equivalent to

Min $(D_k c)^T y$
subject to
$(AD_k)y = 0$
$e^T y = 1$
$y \ge 0$. (11.12)
Now problem (11.12) is a LPP in the transformed space. In fact it is in the form of problem (11.10) with both the Assumptions 1 and 2 being satisfied.

Since, in the transformed space we shall be solving problem (11.12) and NOT problem (11.11), the objective values at the iterates $\{x^{(k)}\}$ generated by Karmarkar's algorithm approach the optimal objective value zero in the limit as $k \to +\infty$, but not necessarily monotonically. Hence in practice, we terminate the algorithm when the objective function value of problem (11.11) gets sufficiently close to zero.


Projection Matrix

The next major step in Karmarkar's algorithm is to move from the centre $e/n$ in the direction of the negative gradient. Since, in general, the negative gradient may point in a direction away from the feasible region, we will need to find the projection of the negative gradient on the subspace given by all vectors $x \in R^n$ for which $Ax = 0$ (or $(AD_k)y = 0$ in the transformed space). So we should obtain an expression for the projection matrix which should be used to find the desired projection.








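Let $M = \{x \in R^n : Ax = 0\}$ be the null space of $A$ and let $N$ be the row space of $A$. Then $M \perp N$ and $R^n = M \oplus N$; that is, any vector of $R^n$ can be written as $u + v$ with $u \in M$, $v \in N$. Let $g_k = \nabla f(x^{(k)})$, so that $-g_k$ is the negative gradient. For a feasible movement we have to project $-g_k$ on the null space of the constraints (that is, $M$).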

Choosing $\alpha \in (0,1)$ keeps the new point in the interior of the polytope, because $r = \frac{1}{\sqrt{n(n-1)}}$ is the radius of the largest sphere inscribed in an $(n-1)$-simplex. Some readers may like to verify this for the case $n = 3$ (Fig. 11.7).

Fig. 11.7. (P = (0,0,1), Q = (1,0,0), R = (0,1,0))


Remark 11.7.2 As explained earlier in Section 11.6, since we are solving a LPP in the transformed space rather than the equivalent linear fractional programming problem, the algorithm does not guarantee a monotonic decrease in the objective function value of problem (11.13). However, using the concept of potential function for both the original and the transformed problems, Karmarkar proved that eventually the objective function value $c^T x^{(k)}$ for the current solution $x^{(k)}$ will tend to zero as $k \to +\infty$.

Remark 11.7.3 Since initially as well as for the subsequent iterations, the iterates $x^{(k)}$ are strictly feasible, the point $x^{(p)}$ at which we stop will still remain in the interior of the feasible region. As the optimal solution $x^*$ of a LPP is a corner point, we must have a procedure which takes us from $x^{(p)}$ to $x^*$, again in polynomial time. This procedure is called 'purification step' or 'optimal rounding scheme'. We discuss it after illustrating the working of Karmarkar's algorithm for a small example.


Example 11.7.1 Use Karmarkar's algorithm to solve the following LPP

Min $x_1 + 3x_2 - 3x_3$
subject to
$x_2 - x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$.

Solution Before we start solving the given LPP by Karmarkar's algorithm, we have to make sure that the given LPP is in the form of problem (11.13) with both assumptions being satisfied.

Looking at the given LPP, we note that by taking $x = (x_1, x_2, x_3)^T$, $A = [0, 1, -1]$, $c = (1, 3, -3)^T$ the problem is in the desired form. Also $e/3$ is feasible for the given problem, because $A(e/3) = 0$, $e^T(e/3) = 1$, and $e/3 > 0$. Further the minimum value is zero which is attained at $(x_1 = 0, x_2 = 1/2, x_3 = 1/2)$.


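First iteration of Karmarkar's Algorithm

Step 1 Start with $x^{(0)} = (1/3, 1/3, 1/3)^T$. Take $\epsilon = 0.1$ and set $k = 0$.
Step 2 $x^{(0)}$ yields $z = c^T x^{(0)} = 1/3 > 0.1$, so we go to Step 3.
Step 3 $A = (0, 1, -1)$, $D_0 = diag(x^{(0)}) = diag(1/3, 1/3, 1/3)$, $AD_0 = (0, 1/3, -1/3)$,

$B = \begin{pmatrix} AD_0 \\ e^T \end{pmatrix} = \begin{pmatrix} 0 & 1/3 & -1/3 \\ 1 & 1 & 1 \end{pmatrix}$, $(BB^T)^{-1} = \begin{pmatrix} 9/2 & 0 \\ 0 & 1/3 \end{pmatrix}$,

$P = I - B^T (BB^T)^{-1} B = \begin{pmatrix} 2/3 & -1/3 & -1/3 \\ -1/3 & 1/6 & 1/6 \\ -1/3 & 1/6 & 1/6 \end{pmatrix}$,

$c = (1, 3, -3)^T$, $D_0 c = (1/3, 1, -1)^T$, and $c_p = P D_0 c = (2/9, -1/9, -1/9)^T$.

Using $\alpha = 1/4$ we get

$y^{(1)} = \frac{e}{n} - \frac{\alpha}{\sqrt{n(n-1)}} \frac{c_p}{\|c_p\|} = (1/4, 3/8, 3/8)^T$,

and

$x^{(1)} = \theta^{-1}(y^{(1)}) = \frac{D_0 y^{(1)}}{e^T D_0 y^{(1)}} = (1/4, 3/8, 3/8)^T$.

Since $D_0$ is a multiple of the identity matrix, $x^{(1)}$ will be same as $y^{(1)}$. This will not be the case in general. Also $c^T x^{(1)} = 1/4 = 0.25 > 0.1$, so we perform the next iteration.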


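Second iteration of Karmarkar's Algorithm

Here $k = 1$, $x^{(1)} = (1/4, 3/8, 3/8)^T$, $D_1 = diag(x^{(1)}) = diag(1/4, 3/8, 3/8)$, $AD_1 = (0, 3/8, -3/8)$,

$B = \begin{pmatrix} AD_1 \\ e^T \end{pmatrix} = \begin{pmatrix} 0 & 3/8 & -3/8 \\ 1 & 1 & 1 \end{pmatrix}$, $(BB^T)^{-1} = \begin{pmatrix} 32/9 & 0 \\ 0 & 1/3 \end{pmatrix}$,

$P = I - B^T (BB^T)^{-1} B = \begin{pmatrix} 2/3 & -1/3 & -1/3 \\ -1/3 & 1/6 & 1/6 \\ -1/3 & 1/6 & 1/6 \end{pmatrix}$,

$c = (1, 3, -3)^T$, $D_1 c = (1/4, 9/8, -9/8)^T$, and $c_p = P D_1 c = (1/6, -1/12, -1/12)^T$.

Using $\alpha = 1/4$ we get

$y^{(2)} = \frac{e}{n} - \frac{\alpha}{\sqrt{n(n-1)}} \frac{c_p}{\|c_p\|} = (1/4, 3/8, 3/8)^T$,

and

$x^{(2)} = \theta^{-1}(y^{(2)}) = \frac{D_1 y^{(2)}}{e^T D_1 y^{(2)}} = (2/11, 9/22, 9/22)^T$.

Also

$c^T x^{(2)} = \frac{2}{11} + 3\left(\frac{9}{22}\right) - 3\left(\frac{9}{22}\right) = \frac{2}{11} = 0.18$.

Thus we have

$x^{(0)} = (1/3, 1/3, 1/3)^T$, $c^T x^{(0)} = 1/3 = 0.33$,
$x^{(1)} = (1/4, 3/8, 3/8)^T$, $c^T x^{(1)} = 1/4 = 0.25$,
$x^{(2)} = (2/11, 9/22, 9/22)^T$, $c^T x^{(2)} = 2/11 = 0.18$.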
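The two iterations above can be reproduced numerically. The following sketch is an illustration under stated assumptions rather than the book's implementation: the names `solve` and `karmarkar_step` are hypothetical, the projected cost is computed as $c_p = D_k c - B^T (BB^T)^{-1} B D_k c$ by solving a small linear system instead of forming $P$, and no stopping test or degenerate case ($c_p = 0$) is handled.

```python
import math

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def karmarkar_step(A, c, x, alpha=0.25):
    """One iteration: centre, project the cost, step from e/n, map back."""
    n = len(x)
    # B stacks the rows of A D_k on top of the row e^T.
    B = [[a * xj for a, xj in zip(row, x)] for row in A] + [[1.0] * n]
    Dc = [cj * xj for cj, xj in zip(c, x)]
    BBt = [[sum(u * v for u, v in zip(r1, r2)) for r2 in B] for r1 in B]
    w = solve(BBt, [sum(u * v for u, v in zip(r, Dc)) for r in B])
    cp = [Dc[j] - sum(w[i] * B[i][j] for i in range(len(B))) for j in range(n)]
    norm = math.sqrt(sum(v * v for v in cp))
    r = 1.0 / math.sqrt(n * (n - 1))          # radius of the inscribed sphere
    y = [1.0 / n - alpha * r * v / norm for v in cp]
    Dy = [xj * yj for xj, yj in zip(x, y)]    # inverse centering map
    total = sum(Dy)
    return [v / total for v in Dy]

A = [[0, 1, -1]]
c = [1, 3, -3]
x0 = [1 / 3, 1 / 3, 1 / 3]
x1 = karmarkar_step(A, c, x0)
x2 = karmarkar_step(A, c, x1)
print(x1, x2, sum(ci * xi for ci, xi in zip(c, x2)))
```

Up to rounding, `x1` and `x2` agree with $x^{(1)}$ and $x^{(2)}$ of Example 11.7.1.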
Purification Step or Optimal Rounding Scheme

Here we are given the final iterate $x^{(p)}$ such that $c^T x^{(p)} < \epsilon$. The purification step or the optimal rounding scheme determines a corner point (b.f.s.) which gives at least as good a value of the objective function as $c^T x^{(p)}$.

If $n$ linearly independent constraints are binding at $x^{(p)}$, then it is already a basic feasible solution. Otherwise, we can find a direction $d \in R^n$ $(d \ne 0)$ in the null space of the system of binding constraints. We then move from $x^{(p)}$ in the direction $d$ if $c^T d \le 0$, and in the direction $-d$ otherwise, until some constraint blocks the motion; this must happen as the feasible region is bounded. Thus we obtain a new solution at which at least one additional linearly independent constraint is binding. Proceeding in this manner, a basic feasible solution $\hat{x}$ to problem (11.13) can be obtained such that $c^T \hat{x} \le c^T x^{(p)}$.

Here we note that the process involves at most $n - (m+1)$ such steps since it begins with $(m+1)$ linearly independent equality constraints of problem (11.13) binding, and each step adds at least one additional linearly independent binding constraint to this set. Also each step is of polynomial complexity. Hence $\hat{x}$ is determined from the final iterate $x^{(p)}$ in polynomial time.
In our example (Example 11.7.1) let the final iterate be $x^{(2)}$. We observe that only 2 linearly independent equality constraints are binding at $x^{(2)}$ (as $Ax^{(2)} = 0$, $e^T x^{(2)} = 1$). Hence we determine $d = (d_1, d_2, d_3)^T \ne 0$ such that $d_2 - d_3 = 0$, $d_1 + d_2 + d_3 = 0$; this gives $d_2 = d_3$ and $d_1 = -2d_3$. Taking $d_3 = 1$ we get $d = (-2, 1, 1)^T$. Since $c^T d = -2 + 3 - 3 = -2 < 0$ we move along the direction of $d$ (not $-d$), i.e.

$\hat{x} = x^{(2)} + \alpha d = \left(\frac{2}{11}, \frac{9}{22}, \frac{9}{22}\right) + \alpha(-2, 1, 1) = \left(\frac{2}{11} - 2\alpha, \frac{9}{22} + \alpha, \frac{9}{22} + \alpha\right)$.

As $\hat{x} \ge 0$ we get $\alpha \le \frac{1}{11}$. Therefore for $\alpha = \frac{1}{11}$ the constraint $x_1 = 0$ blocks the motion and it becomes the third (linearly independent) equality constraint which is binding. Hence we get $\hat{x} = \left(\frac{2}{11} - \frac{2}{11}, \frac{9}{22} + \frac{1}{11}, \frac{9}{22} + \frac{1}{11}\right)^T = (0, \frac{1}{2}, \frac{1}{2})^T$, which is the required corner point solution.

11.8 Putting a general LPP in Karmarkar's form

We shall illustrate this by an example only. Let the given LPP be

Max $z = 3x_1 + x_2$
subject to
$2x_1 - x_2 \le 2$
$x_1 + 2x_2 \le 5$
$x_1, x_2 \ge 0$. (11.14)

To put problem (11.14) in Karmarkar's form (i.e. in the form of problem (11.13)) we perform the following steps

Step 1 Write the dual of (11.14), i.e.

Min $w = 2y_1 + 5y_2$
subject to
$2y_1 + y_2 \ge 3$
$-y_1 + 2y_2 \ge 1$
$y_1, y_2 \ge 0$. (11.15)

Step 2 From duality theorem, any feasible solution of the following set of constraints will yield the optimal solution of (11.14)

$3x_1 + x_2 - 2y_1 - 5y_2 = 0$
$2x_1 - x_2 \le 2$
$x_1 + 2x_2 \le 5$
$2y_1 + y_2 \ge 3$
$-y_1 + 2y_2 \ge 1$
$x_1, x_2, y_1, y_2 \ge 0$. (11.16)

Step 3 Introduce slack and surplus variables in the system (11.16) to yield

$3x_1 + x_2 - 2y_1 - 5y_2 = 0$
$2x_1 - x_2 + s_1 = 2$
$x_1 + 2x_2 + s_2 = 5$
$2y_1 + y_2 - r_1 = 3$
$-y_1 + 2y_2 - r_2 = 1$
$x_1, x_2, y_1, y_2, r_1, r_2, s_1, s_2 \ge 0$. (11.17)


Step 4 Find a number $M$ such that any feasible solution to (11.17) will satisfy $\sum x_i + \sum y_i + \sum s_i + \sum r_i \le M$ and append this constraint to (11.17). In our example if we assume that all variables have an upper bound of 10, then $x_1 + x_2 + y_1 + y_2 + s_1 + s_2 + r_1 + r_2 \le (10 \times 8) = 80$. We then add a slack variable (dummy variable $d_1$) to this constraint. Our new goal is then to find a feasible solution of

$3x_1 + x_2 - 2y_1 - 5y_2 = 0$
$2x_1 - x_2 + s_1 = 2$
$x_1 + 2x_2 + s_2 = 5$
$2y_1 + y_2 - r_1 = 3$
$-y_1 + 2y_2 - r_2 = 1$
$x_1 + x_2 + y_1 + y_2 + s_1 + s_2 + r_1 + r_2 + d_1 = 80$
$x_1, x_2, y_1, y_2, r_1, r_2, s_1, s_2, d_1 \ge 0$. (11.18)


Step 5 Now define a new dummy variable $d_2$ such that $d_2 = 1$. We use this variable $d_2$ (which is equal to 1) to make the R.H.S. of (11.18) (except the last constraint) equal to zero. To do this we add the appropriate multiple of the constraint $(d_2 = 1)$ to each constraint of (11.18) (except the last constraint) having a nonzero R.H.S. (e.g. we add $-2(d_2 = 1)$ to the constraint $2x_1 - x_2 + s_1 = 2$). We then replace the last constraint of (11.18) by the following two constraints

(i) Add $(d_2 = 1)$ to the last constraint of (11.18).
(ii) Subtract $M (= 80)$ times $(d_2 = 1)$ from the last constraint of (11.18).

We now seek a feasible solution of

$3x_1 + x_2 - 2y_1 - 5y_2 = 0$
$2x_1 - x_2 + s_1 - 2d_2 = 0$
$x_1 + 2x_2 + s_2 - 5d_2 = 0$
$2y_1 + y_2 - r_1 - 3d_2 = 0$
$-y_1 + 2y_2 - r_2 - d_2 = 0$
$x_1 + x_2 + y_1 + y_2 + s_1 + s_2 + r_1 + r_2 + d_1 - 80d_2 = 0$
$x_1 + x_2 + y_1 + y_2 + s_1 + s_2 + r_1 + r_2 + d_1 + d_2 = 81$
$x_1, x_2, y_1, y_2, r_1, r_2, s_1, s_2, d_1, d_2 \ge 0$. (11.19)

Here $d_2 = 1$ and $x_1 + x_2 + y_1 + y_2 + s_1 + s_2 + r_1 + r_2 + d_1 = 80$ are equivalent to the last two constraints of (11.19).
Step 6 Make the following change of variables

$x_j = (M+1)x_j'$, $y_j = (M+1)y_j'$, $s_j = (M+1)s_j'$, $r_j = (M+1)r_j'$, $d_j = (M+1)d_j'$,

which yields

$3x_1' + x_2' - 2y_1' - 5y_2' = 0$
$2x_1' - x_2' + s_1' - 2d_2' = 0$
$x_1' + 2x_2' + s_2' - 5d_2' = 0$
$2y_1' + y_2' - r_1' - 3d_2' = 0$
$-y_1' + 2y_2' - r_2' - d_2' = 0$
$x_1' + x_2' + y_1' + y_2' + s_1' + s_2' + r_1' + r_2' + d_1' - 80d_2' = 0$
$x_1' + x_2' + y_1' + y_2' + s_1' + s_2' + r_1' + r_2' + d_1' + d_2' = 1$
$x_1', x_2', y_1', y_2', r_1', r_2', s_1', s_2', d_1', d_2' \ge 0$. (11.20)

Step 7 We now ensure that a point that sets all variables equal is feasible to (11.20). For this we add a variable $d_3'$ (artificial variable) to the last constraint in (11.20) and add a multiple of $d_3'$ to each of the other constraints. This multiple is chosen so that the sum of the coefficients of all variables in each constraint (except the last) will be zero. This yields

$3x_1' + x_2' - 2y_1' - 5y_2' + 3d_3' = 0$
$2x_1' - x_2' + s_1' - 2d_2' = 0$
$x_1' + 2x_2' + s_2' - 5d_2' + d_3' = 0$
$2y_1' + y_2' - r_1' - 3d_2' + d_3' = 0$
$-y_1' + 2y_2' - r_2' - d_2' + d_3' = 0$
$x_1' + x_2' + y_1' + y_2' + s_1' + s_2' + r_1' + r_2' + d_1' - 80d_2' + 71d_3' = 0$
$x_1' + x_2' + y_1' + y_2' + s_1' + s_2' + r_1' + r_2' + d_1' + d_2' + d_3' = 1$
$x_1', x_2', y_1', y_2', r_1', r_2', s_1', s_2', d_1', d_2', d_3' \ge 0$. (11.21)

In (11.21), the point $x_1' = x_2' = y_1' = y_2' = r_1' = r_2' = s_1' = s_2' = d_1' = d_2' = d_3' = \frac{1}{11}$ is feasible. Since $d_3'$ should be zero in a feasible solution of (11.20) we need to take the objective function as Min $d_3'$ in (11.21). If (11.20) is feasible then Min $d_3' = 0$ and then the values of the remaining variables will yield a feasible solution of (11.20), and hence the values of $x_1$ and $x_2$ recovered from the optimal solution of (11.21) (with Min $d_3' = 0$) will give an optimal solution of the given LPP (11.14). The LPP in (11.21) satisfies all conditions of Karmarkar's form (11.13) and hence is ready to be solved by Karmarkar's algorithm.
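Steps 1 to 7 are mechanical and can be sketched in code. The function below is a hedged illustration, not the book's procedure verbatim: the name `karmarkar_form`, the variable ordering $(x, y, s, r, d_1, d_2, d_3)$ and the argument `bound` (the assumed common upper bound on all variables, 10 in the example) are assumptions. It returns the coefficient rows of the homogeneous constraints of (11.21); the normalising constraint (sum of all variables equal to 1) and the objective Min $d_3'$ are implicit.

```python
from fractions import Fraction

def karmarkar_form(c, A, b, bound):
    """Homogeneous rows of (11.21) for Max c^T x subject to A x <= b, x >= 0."""
    m, n = len(A), len(c)
    M = bound * (2 * (m + n))                       # bound on the sum of x, y, s, r
    # Duality row: c^T x - b^T y = 0.
    rows = [list(c) + [-bi for bi in b] + [0] * (m + n) + [0, 0]]
    for i in range(m):                              # primal rows: A x + s - b d2 = 0
        slack = [int(k == i) for k in range(m)]
        rows.append(list(A[i]) + [0] * m + slack + [0] * n + [0, -b[i]])
    for j in range(n):                              # dual rows: A^T y - r - c d2 = 0
        col = [A[i][j] for i in range(m)]
        surplus = [-int(k == j) for k in range(n)]
        rows.append([0] * n + col + [0] * m + surplus + [0, -c[j]])
    rows.append([1] * (2 * (m + n)) + [1, -M])      # bounding row with d1 and d2
    # The d3 coefficient makes every row sum to zero, so the centre is feasible.
    return [row + [-sum(row)] for row in rows]

A21 = karmarkar_form([3, 1], [[2, -1], [1, 2]], [2, 5], 10)
for row in A21:
    print(row)

nvars = len(A21[0])
centre = [Fraction(1, nvars)] * nvars
print(all(sum(a * v for a, v in zip(row, centre)) == 0 for row in A21))
```

For the data of (11.14) the last column, i.e. the coefficients of $d_3'$, comes out as 3, 0, 1, 1, 1, 71, in agreement with (11.21).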


11.9 Worst Case Computational Complexity of Karmarkar's Algorithm

We shall now show that the computational complexity of Karmarkar's algorithm is $O(n^{3.5}L)$. For this we shall make use of the following Lemma due to Karmarkar.

Lemma 11.9.1 Let $x^{(k)}$, $c$, $\alpha$, and $y^{(k+1)}$ be as in the description of Karmarkar's algorithm. Then either

(i) $(D_k c)^T y^{(k+1)} = 0$, or
(ii) $g(e/n) - g(y^{(k+1)}) \ge \delta$,

where $D_k = diag(x_1^{(k)}, \ldots, x_n^{(k)})$ and $\delta$ is a constant $(> 0)$ depending on $\alpha$. For $\alpha = 1/4$, $\delta \ge 1/8$.

Here $g$ is the potential function given by $g(y) = \sum_{j=1}^{n} \ln\left((D_k c)^T y / y_j\right)$ for the transformed cost, and $f(x) = \sum_{j=1}^{n} \ln\left(c^T x / x_j\right)$ is the potential function for the true cost.

Theorem 11.9.1 Any linear programming problem can be polynomially reduced to the form of problem (11.13) specified for Karmarkar's Algorithm. Moreover in time polynomial in the size of the input, the algorithm either computes a solution or proves that none exists.


Proof. We have already shown how a general LPP is reduced to the format required by Karmarkar's algorithm. For proving polynomial time complexity of Karmarkar's algorithm, we consider the implications of the two cases as given in Lemma 11.9.1.

Case (i) $(D_k c)^T y^{(k+1)} = 0$. This gives

$c^T x^{(k+1)} = 0$

and therefore we stop.

Case (ii) $g(e/n) - g(y^{(k+1)}) \ge \delta$. Since $e/n = \theta(x^{(k)})$ and $y^{(k+1)} = \theta(x^{(k+1)})$, and since $g(\theta(x))$ and $f(x)$ differ only by a constant depending on $x^{(k)}$, we have

$g(\theta(x^{(k)})) - g(\theta(x^{(k+1)})) = f(x^{(k)}) - f(x^{(k+1)}) \ge \delta$. (11.22)

Therefore, applying (11.22) at each iteration gives

$f(x^{(k+1)}) \le f(x^{(0)}) - (k+1)\delta$, (11.23)

or

$f(x^{(k)}) \le f(x^{(0)}) - k\delta$. (11.24)

If at an iteration condition (i) arises then the algorithm will clearly stop as the minimum value of problem (11.13) is assumed to be zero. If we go on for $k$ iterations without stopping in that way, then (11.24) yields

$f(x^{(k)}) \le f(x^{(0)}) - k\delta$. (11.25)

Now using the definition of $f$, (11.25) gives

$\sum_{j=1}^{n} \ln\left(\frac{c^T x^{(k)}}{x_j^{(k)}}\right) \le \sum_{j=1}^{n} \ln\left(\frac{c^T x^{(0)}}{x_j^{(0)}}\right) - k\delta$, (11.26)

that is,

$n \ln\left(\frac{c^T x^{(k)}}{c^T x^{(0)}}\right) \le \sum_{j=1}^{n} \ln x_j^{(k)} - \sum_{j=1}^{n} \ln x_j^{(0)} - k\delta$. (11.27)

Since $x_j^{(k)} \le 1$ for all $j$ as $x^{(k)}$ is on the simplex $S^{n-1}$, and $x^{(0)} = e/n$, the equation (11.27) gives

$n \ln\left(\frac{c^T x^{(k)}}{c^T x^{(0)}}\right) \le n \ln n - k\delta$. (11.28)

On choosing $k = \frac{nq + n \ln n}{\delta}$, where $q$ determines the tolerance $2^{-q}$, we have from (11.28)

$\ln\left(\frac{c^T x^{(k)}}{c^T x^{(0)}}\right) \le -q$.

Therefore,

$\frac{c^T x^{(k)}}{c^T x^{(0)}} \le \exp(-q) \le 2^{-q}$.

Hence, after $k = O(nq + n \ln n)$ iterations, the algorithm will stop at Step 2. Since each iteration requires $O(n^{2.5})$ computation to do the necessary projections (using the rank 1 update rule), the entire computation is $O(n^{3.5}q + n^{3.5} \ln n)$ steps, which can be expressed as $O(n^{3.5}(L + \ln n))$ where $L$ is the length of the input.
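The role of the potential function in the proof can be illustrated on the iterates of Example 11.7.1. The sketch below is an illustration only: the function names `potential` and `iteration_bound` are assumptions, and $\delta = 1/8$ is the value quoted in Lemma 11.9.1 for $\alpha = 1/4$. It evaluates $f$ at $x^{(0)}, x^{(1)}, x^{(2)}$ and computes the smallest integer $k$ with $k\delta \ge nq + n \ln n$.

```python
import math

def potential(c, x):
    """f(x) = sum_j ln(c^T x / x_j), defined for x > 0 with c^T x > 0."""
    cx = sum(ci * xi for ci, xi in zip(c, x))
    return sum(math.log(cx / xj) for xj in x)

def iteration_bound(n, q, delta=0.125):
    """Smallest integer k with k * delta >= n*q + n*ln(n)."""
    return math.ceil((n * q + n * math.log(n)) / delta)

c = [1, 3, -3]
iterates = [[1 / 3, 1 / 3, 1 / 3], [1 / 4, 3 / 8, 3 / 8], [2 / 11, 9 / 22, 9 / 22]]
f_values = [potential(c, x) for x in iterates]
drops = [a - b for a, b in zip(f_values, f_values[1:])]
print(f_values, drops)
print(iteration_bound(3, 10))
```

In this example each iteration lowers the potential by about 0.81, comfortably more than the guaranteed $1/8$, which is consistent with the bound being a worst case estimate.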


11.10 Summary and Additional Notes

* Certain complexity issues with regard to the simplex algorithm and an introductory discussion on Karmarkar's projective scaling algorithm constitute the main theme of this chapter.

* Section 11.4 is devoted to the study of the Klee and Minty cube and its variant so as to show that the simplex algorithm is not a polynomial time algorithm.

* Sections 11.5 to 11.7 are devoted to the study of various aspects of Karmarkar's algorithm, including the centering transformation and the description of the main algorithm. The procedure for putting a given LPP in Karmarkar's form is given in Section 11.8, and the computational complexity of the algorithm is discussed in Section 11.9.

* The Soviet mathematician L. G. Khachian was the first to develop a polynomial time algorithm for solving LPP's in 1979. Khachian's algorithm (the ellipsoid algorithm) derives its main idea from an earlier work of other Russian mathematicians. The projective scaling algorithm was developed by Narendra Karmarkar at AT and T Bell Labs (USA). Though Khachian's algorithm and Karmarkar's algorithm both exhibit that the class of linear programming problems is in the class P of problems, Khachian's algorithm did not prove to be of significant practical computational value. This was because the ellipsoid method requires that calculations be carried out with very high precision. Karmarkar's algorithm is free of the ellipsoid method's precision problem, and its worst case behaviour is also substantially better than that of Khachian's algorithm.

From the very inception, there have been many numerical experiments on LPP's of various sizes so as to compare Karmarkar's algorithm and Dantzig's simplex method. It has been observed that for LPP's of moderate size, the simplex method performs quite well in practice, despite its dreadful worst case behaviour. The main reason for this could be the fact that the average case probabilistic computational complexity of the simplex algorithm is polynomial (Borgwardt [24]).

Although there is no evidence to show that the projective scaling algorithm can beat the simplex algorithm by a factor of 50 (as originally claimed by Karmarkar), there is consensus that the number of iterations in Karmarkar's algorithm grows very slowly with the size of the problem.
Mia) ge vA} 
S0iq 4 Consider ine LPP Wi || 
` X? That 
subject to Wt 
: ti nw l 
hh ; My + Xp — Dy, = 0 Hil 
ie Te Ole il ‘i 
Can the above LPP be solved directly by Karmarkar's algorithm? Give reasons for your answer. Perform one iteration of the said algorithm.
11.2 Consider the LPP
Min $-x_1 - 2x_2 + 4x_5$
subject to
$x_1 - x_2 + 2x_3 - 2x_5 = 0$
$x_1 + 2x_2 + x_4 - 4x_5 = 0$
$x_1 + x_2 + x_3 + x_4 + x_5 = 1$
$x_1, x_2, x_3, x_4, x_5 \ge 0$.

Check for the above LPP
1. $x^{(0)} = e/5$ is feasible.
2. $x = (0, 2/5, 2/5, 0, 1/5)^T$ is optimal.
3. The optimal value is zero.
Hence perform one complete iteration of Karmarkar's algorithm.

11.3 Consider the LPP
Max $2x_1 + x_2$
subject to
$x_1 - x_2 \le 2$
$x_1 + 2x_2 \le 4$
$x_1, x_2 \ge 0$.
Use duality to convert the above LPP in the form (KLP) so that Karmarkar's algorithm could be used.
11.4 Consider the LPP given at Ques 3 above. Identify an isotonic path of maximum length and hence determine the maximum number of iterations that the simplex algorithm may take (in the worst case) to get its optimal solution.

11.5 Consider the LPP
Max $2x_1 - x_2$
subject to
$x_1 - x_2 + x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$.

Is the point $\bar{x} = (1/4, 1/2, 1/4)^T$ strictly feasible? Let $\theta_{\bar{x}}$ be the centering transformation with respect to the given point $\bar{x}$. Obtain

1. $\theta_{\bar{x}}(1/4, 1/2, 1/4)$
2. $\theta_{\bar{x}}(1/3, 1/3, 1/3)$
3. $\theta_{\bar{x}}(0, 0, 1)$
4. $\theta_{\bar{x}}(1/2, 1/2, 0)$
5. $\theta_{\bar{x}}(1/2, 1/4, 1/4)$.

11.6 Consider the LPP 
Maz XD 


subject to
$x_1 + x_2 - x_3 = 0$
$x_1 + x_2 + x_3 = 1$
$x_1, x_2, x_3 \ge 0$.

1. Is $x = (1/4, 1/4, 1/2)^T$ strictly feasible?
2. Use the explicit representation of the mapping $\theta_x : S^2 \to S^2$ to write the problem to be solved in the transformed space.
3. Starting with this point, perform one complete iteration of Karmarkar's algorithm.
11.7 Consider the LPP 








Max 6x1 H 6X2 + X3 
subject to 
“1 San = l) 
X1 + X? + x%3= 1] 
1. Show that the given LPP meets all assumptions required to apply Karmarkar's algorithm.
2. Starting with $x^{(0)} = (1/4, 1/4, 1/2)^T$, perform one complete iteration of Karmarkar's algorithm.
3. Sketch the feasible region.

11.8 Consider the LPP


TORT At thing $30 


Wa 1 
tii | t Ay. 
Subject to F 
“1+ %2 <8 
F, 
AX) 4 xy < 10 
Xi; x < () 


Obtain the Sure of the above LPP. 


(b) Identify the isotonic path of the maximum length and indicate the same graphically.
(c) What is the maximum number of iterations required to solve the given LPP?
(d) Use duality to convert the given LPP so that the resulting problem could be solved by Karmarkar's algorithm.

11.9 Identify an isotonic path of length 3 for the following LPP 


Max $3x_1 + 2x_2$
subject to 
XI tX SS 
2x1 + x2 < 10 
ea O a 
Xi ZOE 


Is there an isotonic path of length 4? 


11.10 Use the projection matrix approach to find the projection of the vector $i - j + 2k$ on the xz-plane. Verify your answer by employing the usual 3 dimensional geometry.


Some Generalized Convex Functions and Fractional Programming

12.1 Introduction 


By now we must have realized the potential of convexity in nonlinear optimization. We have already seen that properties of convex functions play a pivotal role in the theoretical and algorithmic developments in the last few chapters. Some of these properties have been studied in Chapter 7. So, it is natural for us to ask, what if we move away from convexity? Can the properties of convex functions be extended to some other classes of functions? In this chapter we attempt to answer such questions in the affirmative. Beginning with the introduction of four new classes of functions, namely, quasiconvex functions, strictly quasiconvex functions, pseudoconvex functions and strictly pseudoconvex functions, that can be viewed as generalizations of the class of convex functions, we move on to describe their applications in fractional programming problems.


12.2 Quasiconvex and Quasiconcave Functions 


One of the important properties of a convex function is that its level sets are convex sets, i.e. if $S \subseteq R^n$ is a convex set and $f : S \to R$ is a convex function on $S$ then the level sets of $f$, given by $T_\alpha = \{x \in S : f(x) \le \alpha\}$, are convex sets, for every $\alpha \in R$. It was noted in Chapter 7 that the converse of this statement is not necessarily true. For instance, the function $f(x) = x^3$, $x \in R$, is obviously a nonconvex function but its level sets are intervals, $(-\infty, \alpha^{1/3}]$ (with $\alpha^{1/3}$ denoting the real cube root), which are convex sets, for any $\alpha \in R$. We look at a little more involved example, say, $f : [0,2] \to R$ defined by


CE KARN < 
a 7 X A CER ; VI- x2, 0 < w ip 1 


| ERRA = ies oor 2), 12x52 
` z JO s 


One of the important property of c 
ie. if SC R” is a convex set and f : 
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aa 1-a 7 ay oa cae 
[a = [Vi-a?, 2 a a>, 
R, 


are convex sets. 
Thus, we look 
terms of the level sets. Thi 


w class of functions that can be completely characteris 
€ . . 
ota s leads to the notion of quasiconvexity. 


ed i 


Definition 12.2.1 (Quasiconvex Function). Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$.
Then f is called a quasiconvex function on S if for all $x, u \in S$ and for all $0 \le \lambda \le 1$,

$$ f(\lambda x + (1 - \lambda)u) \le \operatorname{Max}\{f(x), f(u)\}. $$


Analogously, we can define quasiconcave function. 


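Definition 12.2.2 (Quasiconcave Function). Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$.
Then f is called a quasiconcave function on S if for all $x, u \in S$ and for all $0 \le \lambda \le 1$,

$$ f(\lambda x + (1 - \lambda)u) \ge \operatorname{Min}\{f(x), f(u)\}. $$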
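The defining inequality can be probed numerically on a finite sample. The sketch below is only a sampling test, not a proof: the helper name `is_quasiconvex_on_grid`, the grid on [-3, 3] and the three test functions are assumptions chosen for illustration.

```python
import math

def is_quasiconvex_on_grid(f, points, lambdas, tol=1e-12):
    """Return True if f(l*x + (1-l)*u) <= max(f(x), f(u)) for all sampled x, u, l."""
    for x in points:
        for u in points:
            bound = max(f(x), f(u))
            lo, hi = min(x, u), max(x, u)
            for lam in lambdas:
                z = lam * x + (1.0 - lam) * u
                z = min(max(z, lo), hi)  # guard against rounding outside the segment
                if f(z) > bound + tol:
                    return False
    return True

points = [i / 4 for i in range(-12, 13)]   # sample of [-3, 3]
lambdas = [j / 8 for j in range(9)]        # sample of [0, 1]

cube_ok = is_quasiconvex_on_grid(lambda x: x ** 3, points, lambdas)
floor_ok = is_quasiconvex_on_grid(math.floor, points, lambdas)
neg_square_ok = is_quasiconvex_on_grid(lambda x: -x * x, points, lambdas)
```

On this sample the cubic and the greatest integer function pass, while $-x^2$ fails (take x = -3, u = 3 and the midpoint 0).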


From the above two definitions, we can observe the following.

(i) f is a quasiconcave function on S if and only if $-f$ is a quasiconvex function on S.

(ii) Every convex (respectively, concave) function is quasiconvex (respectively, quasiconcave),
but the converse is not necessarily true. We will see a few examples later to
support this assertion.

(iii) A quasiconvex (respectively, quasiconcave) function need not be continuous in its
domain. For example, consider a function $f(x) = [x]$, where $[x]$ is the greatest integer
not greater than x. Then f is a quasiconvex function but fails to be
continuous at integer points.

(iv) The sum of two quasiconvex (respectively, quasiconcave) functions need not be a
quasiconvex (respectively, quasiconcave) function. For example, one can take two piecewise
defined quasiconvex functions $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ whose sum $f = f_1 + f_2$ is not a
quasiconvex function [the explicit formulas of this example are not legible in the source].
The next result is important as it completely characterizes the class of quasiconvex
functions.

Theorem 12.2.1 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$. The level sets of f are
given by $\Gamma_\alpha = \{x \in S : f(x) \le \alpha\}$, $\alpha \in \mathbb{R}$. f is a quasiconvex function on S if and only if
$\Gamma_\alpha$ is a convex set, for every $\alpha \in \mathbb{R}$.
n à 
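Proof. Assume that f is a quasiconvex function on S. If for $\alpha \in \mathbb{R}$, $\Gamma_\alpha$ is either an empty
set or a singleton set then the result follows trivially. Suppose $x, u \in \Gamma_\alpha$. Then

$$ f(x) \le \alpha, \quad f(u) \le \alpha, $$

which in view of quasiconvexity yields

$$ f(\lambda x + (1 - \lambda)u) \le \operatorname{Max}\{f(x), f(u)\} \le \alpha, \quad \forall \lambda \in [0, 1]. $$

Consequently, $\lambda x + (1 - \lambda)u \in \Gamma_\alpha$, $\forall \lambda \in [0, 1]$, implying $\Gamma_\alpha$ is a convex set.

For the converse, let $x, u \in S$ be such that $f(x) \le f(u)$. Then, $x, u \in \Gamma_{f(u)}$. By virtue of
convexity of the set $\Gamma_\alpha$, for every $\alpha \in \mathbb{R}$, we get that $\lambda x + (1 - \lambda)u \in \Gamma_{f(u)}$, $\forall \lambda \in [0, 1]$.
Thereby giving

$$ f(x) \le f(u) \;\Rightarrow\; f(\lambda x + (1 - \lambda)u) \le f(u), \quad \forall \lambda \in [0, 1]. $$

Thus, f is a quasiconvex function on S.

An analogous result can be worked out for quasiconcave functions in terms of their
upper level sets.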


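Corollary 12.2.1 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$. The upper level sets of f
are given by $\Omega_\alpha = \{x \in S : f(x) \ge \alpha\}$, $\alpha \in \mathbb{R}$. f is a quasiconcave function on S if and
only if $\Omega_\alpha$ is a convex set, for every $\alpha \in \mathbb{R}$.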


The above characterization of quasiconvex (respectively, quasiconcave) functions is
convenient to use. For instance, for $f(x) = [x]$, $x \in \mathbb{R}$, the greatest integer function, we
find that f is neither a convex function nor a concave function on $\mathbb{R}$ but f is both a
quasiconvex and a quasiconcave function on $\mathbb{R}$. It can easily be verified that for any
$\alpha \in \mathbb{R}$ we have $\Gamma_\alpha = (-\infty, \lfloor \alpha \rfloor + 1)$ and $\Omega_\alpha = [\lceil \alpha \rceil, \infty)$, which are convex sets. Thus,
quasiconvexity (respectively, quasiconcavity) is a genuine generalization of convexity
(respectively, concavity).













Theorem 12.2.1 can be interpreted geometrically as follows. A function f is quasiconvex
if and only if the line segment joining any two points on any two level curves of f
lies nowhere above the level curve of f corresponding to the maximum value of f.
This is depicted in the first figure in Fig 12.1, where a point x lies on the level curve
$f(x) = \alpha_1$ and a point u lies on the level curve $f(u) = \alpha_2$. Here, $\alpha_1 \le \alpha_2$ are
two real constants. Then any point on the line segment joining x and u is of the form
$\lambda x + (1 - \lambda)u$, and one can see that $f(\lambda x + (1 - \lambda)u) \le \alpha_2$, $\forall \lambda \in [0, 1]$.

Fig. 12.1.

A similar interpretation can be used to describe Corollary 12.2.1, i.e. a function f is
quasiconcave if and only if the line segment joining any two points on any two upper
level curves of f lies nowhere below the upper level curve of f corresponding to the
minimum value of f. This statement is depicted in the second figure in Fig 12.1.

At this point, one may wonder why so much of the discussion is centered around
level sets. We pause here to answer this question from an economic point of view. In fact,
one of the earliest applications of quasiconvexity/quasiconcavity appeared in economic
theory.


In economic theory, the preferences of a consumer over bundles of goods are represented
mathematically by a continuous utility function, which assigns to each bundle its
preference (or utility) level. The most preferred alternative is the one with the highest
utility level, and bundles with the same utility value are at the same preference level.
Here, the actual numbers assigned to the bundles are not important; as long as the
ordering is maintained by the utility function, the interpretation would remain the same.
The bundles having the same utility level form a curve, called an indifference curve
[the numerical illustration given at this point is not legible in the source].
The collection of all the indifference curves, each with its assigned utility level,
completely represents the consumer's preferences. It is generally observed for most
commonly used utility functions that the upper level sets of the indifference curves are
convex sets. This is equivalent to the condition that the utility function representing
the consumer preference is a quasiconcave function. Therefore, in economic models,
the critical issue about the function is not whether it is convex or concave but simply
whether its upper or lower level sets are convex sets.

Returning to our earlier discussion, we present some more properties of quasiconvex
and quasiconcave functions.

Theorem 12.2.2 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a differentiable
quasiconvex function on S. Then

$$ x, u \in S, \quad f(x) \le f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) \le 0. $$


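Proof. If $x = u$ then the implication holds trivially. Suppose $x \ne u$. Since S is an open
set, we can find a ball around u, say $N_\delta(u)$, such that $N_\delta(u) \subseteq S$. Here, $\delta > 0$ can be
chosen sufficiently small so that $x \notin N_\delta(u)$. Let $\lambda \in (0, \delta / \|x - u\|)$ with $\lambda < 1$ and define
$\bar{x} = \lambda x + (1 - \lambda)u$. Then, $\bar{x} \in N_\delta(u)$. Quasiconvexity of f yields

$$ f(\bar{x}) = f(\lambda x + (1 - \lambda)u) \le f(u). $$

Thus,

$$ f(u + \lambda(x - u)) - f(u) \le 0. $$

Using differentiability of f at u and dividing by $\lambda > 0$, we get

$$ \nabla f(u)^T (x - u) + \|x - u\| \, \beta(u, \lambda(x - u)) \le 0, $$

where $\beta : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is such that

$$ \lim_{\lambda \to 0} \beta(u, \lambda(x - u)) = 0. $$

Now, taking the limit as $\lambda \to 0$, we get

$$ \nabla f(u)^T (x - u) \le 0. $$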
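The implication in Theorem 12.2.2 can be spot-checked on sample points. The sketch below assumes the test function $f(x, y) = -xy$ on the positive orthant (quasiconvex there, since $xy$ is quasiconcave) and, for contrast, $h(x, y) = -(x^2 + y^2)$, which is not quasiconvex; the helper name and the grid are illustrative choices.

```python
def gradient_condition_holds(f, grad, points, tol=1e-12):
    """Check f(x) <= f(u)  =>  grad(u) . (x - u) <= 0 on all sampled pairs."""
    for x in points:
        for u in points:
            if f(x) <= f(u):
                g = grad(u)
                slope = sum(gi * (xi - ui) for gi, xi, ui in zip(g, x, u))
                if slope > tol:
                    return False
    return True

# Sample of the open positive orthant.
grid = [(i / 2, j / 2) for i in range(1, 9) for j in range(1, 9)]

f = lambda p: -p[0] * p[1]
grad_f = lambda p: (-p[1], -p[0])
holds = gradient_condition_holds(f, grad_f, grid)

# A concave comparison function that is not quasiconvex on the orthant.
h = lambda p: -(p[0] ** 2 + p[1] ** 2)
grad_h = lambda p: (-2 * p[0], -2 * p[1])
fails = not gradient_condition_holds(h, grad_h, grid)
```

For h the condition breaks, for example, at u = (2, 2) and x = (0.5, 3).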


Corollary 12.2.2 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a differentiable
quasiconcave function on S. Then

$$ x, u \in S, \quad f(x) \ge f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) \ge 0. $$


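Recall from Chapter 7 that a twice differentiable function $f : S \to \mathbb{R}$, on an open
convex set $S \subseteq \mathbb{R}^n$, is convex on S if and only if its Hessian matrix $H_f(x)$ is a positive
semidefinite matrix for every $x \in S$. Parallel to this, we have a second order characterization
of a quasiconvex function. For this, we need to understand the concept of the bordered Hessian
matrix.

The bordered Hessian matrix $B_f(x)$ of f is defined as follows

$$ B_f(x) = \begin{pmatrix} 0 & \nabla f(x)^T \\ \nabla f(x) & H_f(x) \end{pmatrix}. $$

It is termed the bordered Hessian matrix as the Hessian matrix $H_f(x)$ of f is bordered by an
extra row and column from the top and the left, respectively. Thus, $B_f(x)$ is
a matrix of order $(n + 1) \times (n + 1)$. The principal minors of $B_f(x)$ are

$$ D_k(x) = \begin{vmatrix} 0 & f_1(x) & \cdots & f_k(x) \\ f_1(x) & f_{11}(x) & \cdots & f_{1k}(x) \\ \vdots & \vdots & & \vdots \\ f_k(x) & f_{k1}(x) & \cdots & f_{kk}(x) \end{vmatrix}, \quad (k = 1, \ldots, n). $$

Here by $f_i(x)$ we mean $\partial f / \partial x_i$ and by $f_{ij}(x)$ we mean $\partial^2 f / \partial x_i \partial x_j$. Note that for a
function f of n variables there are n principal minors, $D_1(x), \ldots, D_n(x)$, of $B_f(x)$.

For example, consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as $f(x, y) = y e^{-x}$. Then the
principal minors of its bordered Hessian matrix $B_f(x)$ are

$$ D_1(x, y) = \begin{vmatrix} 0 & f_1 \\ f_1 & f_{11} \end{vmatrix} = \begin{vmatrix} 0 & -y e^{-x} \\ -y e^{-x} & y e^{-x} \end{vmatrix} = -y^2 e^{-2x}, $$

$$ D_2(x, y) = \begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} = \begin{vmatrix} 0 & -y e^{-x} & e^{-x} \\ -y e^{-x} & y e^{-x} & -e^{-x} \\ e^{-x} & -e^{-x} & 0 \end{vmatrix} = y e^{-3x}. $$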
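The two minors of the example $f(x, y) = y e^{-x}$ can be reproduced numerically. The following is a minimal sketch in plain Python; the helper names `det`, `bordered_minors` and `minors_of_example` are illustrative and the determinant uses a simple Laplace expansion, which is adequate only for small matrices.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def bordered_minors(grad, hess):
    """Return [D_1, ..., D_n] of the bordered Hessian built from grad and hess."""
    n = len(grad)
    b = [[0.0] + list(grad)] + [[grad[i]] + list(hess[i]) for i in range(n)]
    # D_k is the determinant of the leading (k+1) x (k+1) block of b.
    return [det([row[:k + 1] for row in b[:k + 1]]) for k in range(1, n + 1)]

def minors_of_example(x, y):
    e = math.exp(-x)
    grad = [-y * e, e]                 # (f_1, f_2) for f(x, y) = y e^{-x}
    hess = [[y * e, -e], [-e, 0.0]]    # [[f_11, f_12], [f_21, f_22]]
    return bordered_minors(grad, hess)

d1, d2 = minors_of_example(0.5, 2.0)
```

At (x, y) = (0.5, 2) the values agree with the closed forms $-y^2 e^{-2x}$ and $y e^{-3x}$ up to rounding.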
We state below a characterization of a twice differentiable quasiconvex function without
proof.

Theorem 12.2.3 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a twice differentiable
function on S. A necessary condition for f to be a quasiconvex function on S is that
$D_k(x) \le 0$, $\forall k = 1, \ldots, n$, $\forall x \in S$. A sufficient condition for f to be a quasiconvex
function on S is that $D_k(x) < 0$, $\forall k = 1, \ldots, n$, $\forall x \in S$.

Corollary 12.2.3 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a twice differentiable
function on S. A necessary condition for f to be a quasiconcave function on S is that
$D_k(x) \le 0$, if k is odd, and $D_k(x) \ge 0$, if k is even, $\forall x \in S$. A sufficient condition for
f to be a quasiconcave function on S is that $D_k(x) < 0$, if k is odd, and $D_k(x) > 0$, if k is
even, $\forall x \in S$.


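For example, consider a function $f(x, y) = xy$, on the set $S = \{(x, y) : x > 0, \; y > 0\}$.
Then

$$ D_1(x, y) = \begin{vmatrix} 0 & f_1(x, y) \\ f_1(x, y) & f_{11}(x, y) \end{vmatrix} = \begin{vmatrix} 0 & y \\ y & 0 \end{vmatrix} = -y^2 < 0, $$

$$ D_2(x, y) = \begin{vmatrix} 0 & f_1(x, y) & f_2(x, y) \\ f_1(x, y) & f_{11}(x, y) & f_{12}(x, y) \\ f_2(x, y) & f_{21}(x, y) & f_{22}(x, y) \end{vmatrix} = \begin{vmatrix} 0 & y & x \\ y & 0 & 1 \\ x & 1 & 0 \end{vmatrix} = 2xy > 0 \text{ on } S. $$

Hence, f is a quasiconcave function on S. Observe that f is not concave on S as, for the
points (1, 1) and (3, 3) of S,

$$ f\left(\tfrac{1}{2}(1, 1) + \tfrac{1}{2}(3, 3)\right) = 4 < 5 = \tfrac{1}{2} f(1, 1) + \tfrac{1}{2} f(3, 3). $$

Here, we may also note that the level curves of f, $\{(x, y) \in S : f(x, y) = \alpha\}$, $\alpha \in \mathbb{R}$,
are rectangular hyperbolae. So, the upper level sets of f, $\{(x, y) \in S : f(x, y) \ge \alpha\}$, are
convex sets, thereby confirming the quasiconcavity of f on S.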
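Both claims about $f(x, y) = xy$ can be spot-checked directly from the definitions. The sketch below samples the positive orthant on a grid (an illustrative choice) for the quasiconcavity inequality, and evaluates the midpoint inequality at (1, 1) and (3, 3) to exhibit the failure of concavity.

```python
def f(p):
    return p[0] * p[1]

def combo(x, u, lam):
    """Convex combination lam*x + (1-lam)*u of two points."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(x, u))

grid = [(i / 2, j / 2) for i in range(1, 9) for j in range(1, 9)]
lambdas = [k / 8 for k in range(9)]

# Quasiconcavity: f(lam*x + (1-lam)*u) >= min(f(x), f(u)) on all samples.
quasiconcave_on_grid = all(
    f(combo(x, u, lam)) >= min(f(x), f(u)) - 1e-12
    for x in grid for u in grid for lam in lambdas
)

# Concavity would need f(midpoint) >= average of the end values.
x, u = (1.0, 1.0), (3.0, 3.0)
mid = combo(x, u, 0.5)
concavity_gap = f(mid) - (0.5 * f(x) + 0.5 * f(u))   # negative => concavity fails
```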


Theorem 12.2.4 Let $S \subseteq \mathbb{R}^n$ be a polytope and $f : S \to \mathbb{R}$ be a continuous function on
S. Then f possesses a maxima (respectively, minima) at an extreme point of S and of
all its polyhedral subsets if and only if f is a quasiconvex (respectively, quasiconcave)
function on S.
Proof. The proof of the quasiconvex case is given. The quasiconcave case can be discussed
similarly.

Let f be a quasiconvex function on S. Since S is a polytope, it possesses a finite
number of extreme points. Suppose $x_1, \ldots, x_p$ are the extreme points of S and $x_k$ is such that
$f(x_k) = \operatorname{Max}_{1 \le i \le p} f(x_i)$. Let $x \in S$. Then there exist $\lambda_1, \ldots, \lambda_p$ such that

$$ x = \sum_{i=1}^{p} \lambda_i x_i, \quad \lambda_i \ge 0 \; (i = 1, \ldots, p), \quad \sum_{i=1}^{p} \lambda_i = 1. $$

By the definition of quasiconvexity of f, we get

$$ f(x) = f\left(\sum_{i=1}^{p} \lambda_i x_i\right) \le \operatorname{Max}_{1 \le i \le p} f(x_i) = f(x_k). $$

Thus, the maxima of f is attained at an extreme point of S. The same argument applies to
every polyhedral subset of S.

Conversely, suppose f attains its maxima at an extreme point of S and of all its
polyhedral subsets. Then, in particular, a line segment joining two points $x, u \in S$ is a
bounded polyhedral subset of S, so, f possesses its maxima at an extreme point which,
in this case, is either x or u. Thus,

$$ f(\lambda x + (1 - \lambda)u) \le \operatorname{Max}\{f(x), f(u)\}, \quad \forall \lambda \in [0, 1]. $$

Consequently, f is a quasiconvex function on S.
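We now illustrate the above theorem by an example.

Example 12.2.1 Obtain the optimal solution of the following nonlinear optimization
problem

Max $f(x, y) = x^3 + xy$
subject to
[the constraints are not legible in the source].

Solution The feasible set S is a convex polytope whose extreme points include $(0, -1)$
[the remaining extreme points are not legible in the source]. Also, f is a quasiconvex
function on S because

$$ D_1(x, y) = \begin{vmatrix} 0 & f_1(x, y) \\ f_1(x, y) & f_{11}(x, y) \end{vmatrix} = \begin{vmatrix} 0 & 3x^2 + y \\ 3x^2 + y & 6x \end{vmatrix} = -(3x^2 + y)^2 \le 0 \text{ on } S, $$

$$ D_2(x, y) = \begin{vmatrix} 0 & f_1(x, y) & f_2(x, y) \\ f_1(x, y) & f_{11}(x, y) & f_{12}(x, y) \\ f_2(x, y) & f_{21}(x, y) & f_{22}(x, y) \end{vmatrix} = \begin{vmatrix} 0 & 3x^2 + y & x \\ 3x^2 + y & 6x & 1 \\ x & 1 & 0 \end{vmatrix} = 2xy \le 0 \text{ on } S. $$

Next, note that every $(x, y) \in S$ satisfies $0 \le x \le 1$ together with bounds on y
[not legible in the source], from which f can be bounded above on S by its value at an
extreme point. Hence, in agreement with Theorem 12.2.4, the optimal solution is
attained at an extreme point of S.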
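Theorem 12.2.4 suggests a simple computational recipe: to maximize a quasiconvex function over a polytope it suffices to compare its values at the extreme points. The sketch below uses a hypothetical box and a hypothetical convex (hence quasiconvex) objective, not the data of Example 12.2.1, and cross-checks the vertex answer against a brute-force grid.

```python
from itertools import product

def f(x, y):
    # Hypothetical convex (hence quasiconvex) objective, not the one from the example.
    return (x - 1.0) ** 2 + (y + 1.0) ** 2

# Hypothetical box polytope [-1, 2] x [0, 1]; its extreme points are the four corners.
x_bounds, y_bounds = (-1.0, 2.0), (0.0, 1.0)
vertices = list(product(x_bounds, y_bounds))
best_vertex = max(vertices, key=lambda v: f(*v))

# Brute-force comparison over a fine grid of the box.
n = 60
grid = [(x_bounds[0] + (x_bounds[1] - x_bounds[0]) * i / n,
         y_bounds[0] + (y_bounds[1] - y_bounds[0]) * j / n)
        for i in range(n + 1) for j in range(n + 1)]
grid_max = max(f(x, y) for x, y in grid)
```

Here only four function evaluations are needed for the vertex search, against 3721 for the grid.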
We next present a result that will be used in the subsequent section on fractional
programming problems.


Theorem 12.2.5 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f, g : S \to \mathbb{R}$, $g(x) \ne 0$, $\forall x \in S$. Define
$\theta : S \to \mathbb{R}$ as $\theta(x) = \dfrac{f(x)}{g(x)}$. Then $\theta$ is a quasiconvex function on S if any of the following
conditions hold.

(i) f is convex on S, g is linear on S, $g(x) > 0$, $\forall x \in S$.
(ii) f is concave on S, g is linear on S, $g(x) < 0$, $\forall x \in S$.
(iii) f and g are convex on S, $f(x) \le 0$, $g(x) > 0$, $\forall x \in S$.
(iv) f and g are concave on S, $f(x) \ge 0$, $g(x) < 0$, $\forall x \in S$.
(v) f is concave on S, g is convex on S, $f(x) \le 0$, $g(x) < 0$, $\forall x \in S$.
(vi) f is convex on S, g is concave on S, $f(x) \ge 0$, $g(x) > 0$, $\forall x \in S$.

Proof. We shall be proving the theorem for the case (i), case (iii) and case (v). The rest
of the cases follow on similar lines.

Consider the level sets of $\theta$ as

$$ \Gamma_\alpha = \{x \in S : \theta(x) \le \alpha\}, \quad \alpha \in \mathbb{R}. $$

Suppose condition (i) is true. Then,

$$ \Gamma_\alpha = \{x \in S : f(x) - \alpha g(x) \le 0\}, \quad \alpha \in \mathbb{R}. $$

As f is a convex function and g is a linear function, so, for any $\alpha \in \mathbb{R}$, $f - \alpha g$ is a
convex function on S. Thus, $\Gamma_\alpha$ is the zero-level set of the convex function $f - \alpha g$, and
hence a convex set. The level set characterization of quasiconvexity (Theorem 12.2.1)
implies that $\theta$ is a quasiconvex function on S.

If case (iii) holds then

$$ \Gamma_\alpha = \{x \in S : f(x) - \alpha g(x) \le 0\}, \quad \alpha \in \mathbb{R}. $$

For $\alpha \le 0$, $f - \alpha g$ is a convex function on S.
Let $\alpha > 0$. By assumption,

$$ \theta(x) = \frac{f(x)}{g(x)} \le 0 < \alpha, \quad \forall x \in S. $$

Thus, for $\alpha \le 0$, the level set $\Gamma_\alpha$ is a zero-level set of a convex function $f - \alpha g$ and for
$\alpha > 0$, $\Gamma_\alpha$ is the whole of S. In both cases, $\Gamma_\alpha$ is a convex set. Thus, $\theta$ is a quasiconvex
function on S.

In case (v),

$$ \Gamma_\alpha = \{x \in S : f(x) - \alpha g(x) \ge 0\}, \quad \alpha \in \mathbb{R}. $$

For $\alpha \ge 0$, $f - \alpha g$ is a concave function on S, so $\Gamma_\alpha$ is a zero-upper level set of a concave
function, hence a convex set. Let $\alpha < 0$. By assumption,

$$ \theta(x) = \frac{f(x)}{g(x)} \ge 0 > \alpha, \quad \forall x \in S, $$

so that $\Gamma_\alpha$ is the empty set, which is a convex set by convention.
In both cases (i.e. $\alpha \ge 0$, $\alpha < 0$), $\Gamma_\alpha$ is a convex set, making $\theta$ a quasiconvex function
on S.

An analogous result can be stated and proved for quasiconcave functions.

Recall that a function $f : S \to \mathbb{R}$, where $S \subseteq \mathbb{R}^n$ is a convex set, is called a linear
function on S if and only if it is both a convex and a concave function on S. Thus,
$f(x) = c^T x + \alpha$, $x \in \mathbb{R}^n$, $c \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, is a linear function on $\mathbb{R}^n$. From the above discussion it is
clear that the ratio of two linear functions

$$ \theta(x) = \frac{c^T x + \alpha}{d^T x + \beta}, \quad x \in S \subseteq \mathbb{R}^n, \; c, d \in \mathbb{R}^n, \; \alpha, \beta \in \mathbb{R}, $$

with $d^T x + \beta > 0$, $\forall x \in S$ or $d^T x + \beta < 0$, $\forall x \in S$, is both a quasiconvex and a quasiconcave
function on S. We shall be needing this result in the later part of this chapter when we
shall be studying linear fractional programming problems.
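Being both quasiconvex and quasiconcave means that along any segment the value of $\theta$ stays between its values at the two end points. The sketch below tests this on random samples; the coefficients c, d, alpha, beta and the box $[0, 2]^2$ are hypothetical data chosen so that the denominator is positive.

```python
import random

def theta(p, c=(1.0, -2.0), alpha=3.0, d=(0.5, 1.0), beta=4.0):
    # Hypothetical data; the denominator is positive on the sampled box [0, 2]^2.
    num = c[0] * p[0] + c[1] * p[1] + alpha
    den = d[0] * p[0] + d[1] * p[1] + beta
    return num / den

def quasilinear_on_samples(f, trials=2000, seed=0, tol=1e-9):
    """Check min(f(x), f(u)) <= f(lam*x + (1-lam)*u) <= max(f(x), f(u)) on samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.uniform(0, 2), rng.uniform(0, 2))
        u = (rng.uniform(0, 2), rng.uniform(0, 2))
        lam = rng.random()
        z = tuple(lam * a + (1 - lam) * b for a, b in zip(x, u))
        lo, hi = min(f(x), f(u)), max(f(x), f(u))
        if not (lo - tol <= f(z) <= hi + tol):
            return False
    return True

ok = quasilinear_on_samples(theta)
```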

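We would like to point out here that a function f of the type $f(x) = c^T x + \alpha$, $x \in S \subseteq \mathbb{R}^n$,
$c \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, is sometimes termed an affine function in the literature. This is because in
linear algebra, linearity of the function is described by the relation

$$ f(\mu x + \delta u) = \mu f(x) + \delta f(u), \quad \forall \mu, \delta \in \mathbb{R}, \; \forall x, u \in S. $$

However, we have been calling $f(x) = c^T x + \alpha$ a linear function, following the convention
in optimization that it is both a convex and a concave function. In the same spirit we define
the following concept.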


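Definition 12.2.3 (Quasilinear Function). Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$.
Then f is called a quasilinear function on S if it is both a quasiconvex and a quasiconcave
function on S.

For instance, the greatest integer function $f(x) = [x]$, $x \in \mathbb{R}$, discussed above is a
quasilinear function on $\mathbb{R}$, and the ratio of two linear functions $\theta$ described above
is a quasilinear function on S.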
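Theorem 12.2.6 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$ be a quasiconvex function
on S. Then every strict local min point of f is a strict global min point of f on S.

Proof. Let $u \in S$ be a strict local min point of f. Then there exists $\delta > 0$ such that
$f(u) < f(x)$, $\forall x \in N_\delta(u) \cap S$, $x \ne u$. Suppose u is not a strict global min point of f
on S. Then there exists $y \in S$, $y \ne u$, such that $f(y) \le f(u)$.

By quasiconvexity of f,

$$ f(\lambda y + (1 - \lambda)u) \le f(u), \quad \forall \lambda \in [0, 1]. \qquad (12.1) $$

Here $y \notin N_\delta(u)$. Choose $\lambda \in (0, \delta / \|y - u\|)$ with $\lambda < 1$. Then, $\lambda y + (1 - \lambda)u \in N_\delta(u) \cap S$,
and we obtain

$$ f(u) < f(\lambda y + (1 - \lambda)u). \qquad (12.2) $$

(12.1) and (12.2) are not compatible with each other. Consequently, u is a strict
global min point of f on S.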

Corollary 12.2.4 Let $S \subseteq \mathbb{R}^n$ be a convex set and $f : S \to \mathbb{R}$ be a quasiconcave function
on S. Then every strict local max point of f is a strict global max point of f on S.


The above results fail if the phrase ‘strict’ is dropped. In fact a local min point 
(respectively, local max point) of a quasiconvex (respectively, quasiconcave) function is 
not necessarily its point of global min (respectively, global max). Fig 12.2 provides a 


graphical illustration of this statement. 


Fig. 12.2.

The upside down podium like function in the first figure in Fig 12.2 is a quasiconvex
function as its level sets are convex intervals. Here, $x = u$ is a local min point of f
but it is not a global min point of f. In the second figure in Fig 12.2, the function is
a quasiconcave function as its upper level sets are convex sets. Also, u is a local max
point of the function but obviously it is not a global max point of the function.

The above example(s) indicate the need to introduce a stricter version of quasiconvex
(respectively, quasiconcave) functions, called strictly quasiconvex (respectively,
strictly quasiconcave) functions, which possess the local-global extremum property.
Definition 12.2.4 (Strictly Quasiconvex Function). Let $S \subseteq \mathbb{R}^n$ be a convex set
and $f : S \to \mathbb{R}$. Then f is called a strictly quasiconvex function on S if

$$ x, u \in S, \; x \ne u, \; f(x) \le f(u) \;\Rightarrow\; f(\lambda x + (1 - \lambda)u) < f(u), \quad \forall \lambda \in (0, 1). $$

Definition 12.2.5 (Strictly Quasiconcave Function). Let $S \subseteq \mathbb{R}^n$ be a convex set
and $f : S \to \mathbb{R}$. Then f is called a strictly quasiconcave function on S if

$$ x, u \in S, \; x \ne u, \; f(x) \ge f(u) \;\Rightarrow\; f(\lambda x + (1 - \lambda)u) > f(u), \quad \forall \lambda \in (0, 1). $$

From the above definitions, we can easily observe the following.

(i) f is strictly quasiconcave on S if and only if $-f$ is strictly quasiconvex on S.
(ii) A strictly quasiconvex (respectively, quasiconcave) function on S is a quasiconvex
(respectively, quasiconcave) function on S but the converse, in general, is not true.
For example, $f(x) = [x]$, $x \in \mathbb{R}$, the greatest integer function, is both a quasiconvex
and a quasiconcave function on $\mathbb{R}$ but neither a strictly quasiconvex nor a strictly
quasiconcave function on $\mathbb{R}$.
(iii) Sum of two strictly quasiconvex (respectively, quasiconcave) functions need not be a
strictly quasiconvex (respectively, quasiconcave) function on S. To see this, one can take
two piecewise defined functions $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ and form $f = f_1 + f_2$ [the explicit
formulas of this example are not legible in the source].

A further example is a function f on $\mathbb{R}^2$ whose level sets $\{(x, y) : f(x, y) \le \alpha\}$, $\alpha \in \mathbb{R}$,
are bounded by upside down right angles with vertices moving away from the origin
(see Fig 12.3). Such an f is a quasiconvex function on $\mathbb{R}^2$ but it is not a strictly
quasiconvex function, as one can find points $x \ne u$ with $f(x) = f(u)$ for which the strict
inequality in Definition 12.2.4 fails.

Fig. 12.3.


; S i 
Theorem 12.2 7 oy S R” be a conver set and f : S — R be a strictly quasico 
ieee suricily s CONGUE ) function on S. Then every local ri vote mesa 
tively, local Mar point) of f is its unique global min point (respectively, global maz point) 


on S. 
Proof. Let u € S be a local min point of f. Then there exists 6 > 0 such that 


fu) s ION AEN NS: 


es a 


Let u be not the unique global min point of f on S. Then there exists y € 5, y + u 
such that f(y) < f(u). | 


By strict quasiconvexity of f, 


f(Ay + (1—-A)u) < f(u), VAEO, 1). (12.3) 


te N 
-Dedin ii cap Cs 


Now, we choose a À ejo, eae wh A <i Then, Aye (l= Aju € N5(u) N S. Thus 
lly — xl l 
f(u) < fly + (1 - Au). 
(u) < OOE S, yielding 
- m) 
quasiconcave) function is very 
(respectively, maximization) al- 
n the objective func- 


But this contradicts (12.3). Therefore, we must have f 
the required result. 

This property of strictly quasiconvex (respectively, 
useful particularly for the unconstrained minimization 


| gori thms which do not impose any differentiability condition 0 l 
of the function ensures 












tion. In fact strict quasiconvexity (respectively, quasiconcavity) l 
modular rtainty within which the optimal solution for 


ts unimodularity in the interval of unce 


oe T | ait technique or golden section 
jective function is searched, like in ch q 8 


Fibonacci sear 
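As a small illustration of a derivative-free search on a strictly quasiconvex function, the sketch below implements a standard golden section search and applies it to $f(x) = \sqrt{|x - 1|}$, which is strictly quasiconvex on $\mathbb{R}$ but neither convex nor differentiable at its minimizer. The test function, the interval [-2, 3] and the tolerance are assumptions made for this example.

```python
import math

def golden_section_min(f, a, b, tol=1e-8):
    """Shrink [a, b] around the minimizer of a unimodal f without using derivatives."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0      # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:            # minimizer lies in [a, d]; old c becomes the new d
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # minimizer lies in [c, b]; old d becomes the new c
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

f = lambda x: math.sqrt(abs(x - 1.0))
x_star = golden_section_min(f, -2.0, 3.0)
```

Each pass reuses one of the two interior function values, so only one new evaluation of f is needed per iteration.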


The notions of quasiconvexity or quasiconcavity and their strict versions do not require
the function to be differentiable in the domain. In many cases, however, functions belonging
to these classes are differentiable. From the algorithmic point of view, we find that many
well known numerical optimization techniques for the unconstrained optimization problem,
Min f(x), are gradient based and search only for the stationary points of f. If we have
a prior knowledge that the objective function f is convex then the stationary point of f
is definitely its global minimizer. If instead of convexity we check the applicability of this
property (i.e. the stationary point of f is the global minimizer of the unconstrained
optimization problem, Min f(x)) for a differentiable strictly quasiconvex function, then we
find that the same property fails to hold. For example, consider $f(x) = x^3$, $x \in \mathbb{R}$. Then
$\nabla f(0) = 0$ but $x = 0$ is a point of inflexion of f. Note that f is a differentiable strictly
quasiconvex function on $\mathbb{R}$. Thus, even if we know that the objective function f in an
unconstrained minimization problem is differentiable and strictly quasiconvex, a gradient
based search technique will only provide a stationary point of f, and the nature of the
stationary point cannot be judged from the information of strict quasiconvexity.
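The behaviour described for $f(x) = x^3$ can be observed with a plain fixed-step gradient iteration. This is only an illustrative sketch: the starting point, step size and iteration count are arbitrary choices, and the fixed-step scheme stands in for any gradient based method.

```python
def gradient_descent(grad, x0, step, iters):
    """Fixed-step gradient iteration x <- x - step * grad(x)."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
    return x

f = lambda x: x ** 3
grad_f = lambda x: 3.0 * x ** 2

# The iterates decrease towards the stationary point 0 and stall there,
# although f is unbounded below on R (for example f(-1) < f(0)).
x_end = gradient_descent(grad_f, x0=0.5, step=0.1, iters=20000)
```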

These observations lead us to introduce two new classes of functions that share an
important property with convex functions, namely that the stationary points are the global
min points. We describe these functions in the next section.


12.3 Pseudoconvex and Pseudoconcave Functions 


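Definition 12.3.1 (Pseudoconvex Function). Let $S \subseteq \mathbb{R}^n$ be an open set and $f : S \to \mathbb{R}$.
Then f is called a pseudoconvex function on S if it is differentiable and

$$ x, u \in S, \quad \nabla f(u)^T (x - u) \ge 0 \;\Rightarrow\; f(x) \ge f(u). $$

Equivalently,

$$ x, u \in S, \quad f(x) < f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) < 0. $$

Definition 12.3.2 (Pseudoconcave Function). Let $S \subseteq \mathbb{R}^n$ be an open set and $f : S \to \mathbb{R}$.
Then f is called a pseudoconcave function on S if it is differentiable and

$$ x, u \in S, \quad \nabla f(u)^T (x - u) \le 0 \;\Rightarrow\; f(x) \le f(u). $$

Equivalently,

$$ x, u \in S, \quad f(x) > f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) > 0. $$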
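The implication in Definition 12.3.1 can be tested on sample pairs for functions of one variable. In the sketch below the helper name, the grid on [-3, 3] and the two test functions ($x + x^3$ and $x^3$) are illustrative assumptions; a passing result is evidence on the sample only.

```python
def is_pseudoconvex_on_grid(f, df, points, tol=1e-12):
    """Check df(u) * (x - u) >= 0  =>  f(x) >= f(u) on all sampled pairs."""
    for u in points:
        for x in points:
            if df(u) * (x - u) >= 0 and f(x) < f(u) - tol:
                return False
    return True

points = [i / 4 for i in range(-12, 13)]   # sample of [-3, 3]

ok_cubic_plus_linear = is_pseudoconvex_on_grid(
    lambda x: x + x ** 3, lambda x: 1 + 3 * x ** 2, points)
ok_pure_cubic = is_pseudoconvex_on_grid(
    lambda x: x ** 3, lambda x: 3 * x ** 2, points)
```

The pure cubic fails at u = 0, x = -1: the derivative term is zero, yet f(x) < f(u).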
(ii) If f is a differentiable convex (respectively, concave) function on S then f is a
pseudoconvex (respectively, pseudoconcave) function on S. The reverse of this statement
is not necessarily true. To see this, consider $f(x) = x + x^3$, $x \in \mathbb{R}$. Then f is a
pseudoconvex function as $f'(u)(x - u) \ge 0 \Rightarrow x \ge u \Rightarrow f(x) \ge f(u)$, but
f is not a convex function on $\mathbb{R}$ as $f''(x) = 6x$, which is not nonnegative for all $x \in \mathbb{R}$.

(iii) Sum of two pseudoconvex (respectively, pseudoconcave) functions need not be a
pseudoconvex (respectively, pseudoconcave) function. Take, for instance, $f_1(x) = x + x^3$ and
$f_2(x) = -x$, $x \in \mathbb{R}$. Then $f_1$ and $f_2$ are pseudoconvex functions on $\mathbb{R}$ but $f(x) = f_1(x) + f_2(x) = x^3$
is not a pseudoconvex function on $\mathbb{R}$ as for $x = -1$, $u = 0$, $f'(u)(x - u) = 0$ but
$f(x) < f(u)$.

(iv) (a) Every stationary point of a pseudoconvex (respectively, pseudoconcave) function
is its global min point (respectively, max point). It means that if f is a pseudoconvex
(respectively, pseudoconcave) function at $u \in S$ and $\nabla f(u) = 0$ then u is a global min
point (respectively, max point) of f on S. For example, consider a differentiable function
$f : \mathbb{R} \to \mathbb{R}$ defined piecewise with break points at $x = 1$ and $x = 2$ [the explicit formulas
of f and of its derivative are not legible in the source].
By taking three different cases, viz., $u \ge 2$; $1 < u < 2$; $u \le 1$, it can easily be verified
that $f'(u)(x - u) \ge 0 \Rightarrow f(x) \ge f(u)$, $x \in \mathbb{R}$. Thus, f is a pseudoconvex function on
$\mathbb{R}$. Also, $x = 1$ is the only stationary point of f, which is a global min point of f on $\mathbb{R}$.

(iv) (b) As a consequence of (iv) (a), we can assert that if a differentiable function f
possesses a stationary point $u \in S$ which is not a global min point (respectively, global
max point) of f then f cannot be a pseudoconvex (respectively, pseudoconcave)
function. To justify it, let $f(x) = \sin x$, $-\pi \le x \le \pi$. f is not a pseudoconvex function
on $[-\pi, \pi]$ as $f'(\pi/2)(x - \pi/2) = 0$ for $x = 0$ but $f(x) < f(\pi/2)$. Here, take note that f
has two stationary points $u_1 = -\pi/2$ and $u_2 = \pi/2$ but $u_2$ is not the global min point of f.

(v) Geometrically, pseudoconvexity (respectively, pseudoconcavity) of a function f can
be interpreted as follows. If the directional derivative of f at any point u in the
direction $(x - u)$ is nonnegative (respectively, nonpositive) then the function values are
nondecreasing (respectively, nonincreasing) in that direction.

The last observation along with (iv) (a) is significant from the computational view
point. We now present sufficient conditions for the ratio of two functions to be a
pseudoconvex function.
Theorem 12.3.1 Let $S \subseteq \mathbb{R}^n$ be an open set and $f, g : S \to \mathbb{R}$ be differentiable functions,
$g(x) \ne 0$, $\forall x \in S$. Then the function $\theta : S \to \mathbb{R}$ defined as $\theta(x) = \dfrac{f(x)}{g(x)}$
is a pseudoconvex function at $u \in S$ if any of the following conditions hold.

(i) f is convex at u, g is linear on S, $g(x) > 0$, $\forall x \in S$.
(ii) f is concave at u, g is linear on S, $g(x) < 0$, $\forall x \in S$.
(iii) f and g are convex at u, $f(x) \le 0$, $g(x) > 0$, $\forall x \in S$.
(iv) f and g are concave at u, $f(x) \ge 0$, $g(x) < 0$, $\forall x \in S$.
(v) f is concave at u, g is convex at u, $f(x) \le 0$, $g(x) < 0$, $\forall x \in S$.
(vi) f is convex at u, g is concave at u, $f(x) \ge 0$, $g(x) > 0$, $\forall x \in S$.


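Proof. We shall be proving only the case (iii) and case (v) of the above stated conditions.
The rest of the cases can be discussed using the same ideas.

Suppose case (iii) holds. Let $x, u \in S$. Using convexity of f and g at u, we get

$$ f(x) - f(u) \ge \nabla f(u)^T (x - u), $$

$$ g(x) - g(u) \ge \nabla g(u)^T (x - u). $$

Multiplying the first inequality by $g(u) > 0$ and the second inequality by $-f(u) \ge 0$, and
adding, we get

$$ f(x) g(u) - f(u) g(x) \ge \left(g(u) \nabla f(u) - f(u) \nabla g(u)\right)^T (x - u) = (g(u))^2 \, \nabla\theta(u)^T (x - u). $$

Now, suppose $\nabla\theta(u)^T (x - u) \ge 0$. Then

$$ (g(u))^2 \, \nabla\theta(u)^T (x - u) \ge 0 \;\Rightarrow\; f(x) g(u) - f(u) g(x) \ge 0 \;\Rightarrow\; \frac{f(x)}{g(x)} \ge \frac{f(u)}{g(u)}, $$

where the last step uses $g(x) g(u) > 0$. Hence $\theta(x) \ge \theta(u)$.

Suppose case (v) holds. Using concavity of f and convexity of g at u, we get
$f(x) - f(u) \le \nabla f(u)^T (x - u)$ and $g(x) - g(u) \ge \nabla g(u)^T (x - u)$. Multiplying the first
inequality by $g(u) < 0$ and the second inequality by $-f(u) \ge 0$, and adding, we again obtain

$$ f(x) g(u) - f(u) g(x) \ge (g(u))^2 \, \nabla\theta(u)^T (x - u). $$

The rest of the arguments are the same as in the latter part of the case (iii) discussed
above.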
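The key inequality of the proof and the resulting pseudoconvexity can be checked on a concrete instance of case (iii). The sketch below assumes the one-variable data $f(x) = x^2 - 4$ and $g(x) = x^2 + 1$ on the interval (-2, 2), where f is convex and nonpositive and g is convex and positive; these functions are chosen only for illustration.

```python
def f(x):  return x * x - 4.0      # convex, f <= 0 on [-2, 2]
def g(x):  return x * x + 1.0      # convex, g > 0
def df(x): return 2.0 * x
def dg(x): return 2.0 * x

def theta(x):
    return f(x) / g(x)

def dtheta(x):
    # Quotient rule: theta' = (g f' - f g') / g^2
    return (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2

points = [i / 8 for i in range(-15, 16)]   # sample of the open interval (-2, 2)

# f(x) g(u) - f(u) g(x) >= g(u)^2 * theta'(u) * (x - u)
key_inequality = all(
    f(x) * g(u) - f(u) * g(x) >= g(u) ** 2 * dtheta(u) * (x - u) - 1e-9
    for x in points for u in points
)

# theta'(u) (x - u) >= 0  =>  theta(x) >= theta(u)
pseudoconvex = all(
    theta(x) >= theta(u) - 1e-12
    for x in points for u in points
    if dtheta(u) * (x - u) >= 0
)
```

For this data the gap in the key inequality works out to $5(x - u)^2$, which is why it holds for every pair.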
Analogous results can be stated and proved for pseudoconcave functions.

From the above discussion it is clear that a linear fractional function
$\theta(x) = \dfrac{c^T x + \alpha}{d^T x + \beta}$, with $d^T x + \beta > 0$, $\forall x \in S$ or $d^T x + \beta < 0$, $\forall x \in S$, is both
a pseudoconvex and a pseudoconcave function on S.

Definition 12.3.3 (Pseudolinear Function). A function $f : S \subseteq \mathbb{R}^n \to \mathbb{R}$ is called
a pseudolinear function on S if it is both a pseudoconvex function and a pseudoconcave
function on S.

For example, the ratio of the two linear functions described above as $\theta$ is a pseudolinear
function on S.

We shift our attention to explore the relationship between the class of pseudoconvex
functions and the class of quasiconvex functions.

We have already seen that the function $f(x) = x^3$, $x \in \mathbb{R}$, is a quasiconvex function on
$\mathbb{R}$ but not a pseudoconvex function on $\mathbb{R}$. Thus, we can easily assert that a quasiconvex
function need not be a pseudoconvex function. We shall now try to answer the converse.


Theorem 12.3.2 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a pseudoconvex
function on S. Then f is also a quasiconvex function on S.


Proof. Let $x, u \in S$ be such that $f(x) \le f(u)$. In order to prove that f is a quasiconvex
function we need to prove that

$$ f(\lambda x + (1 - \lambda)u) \le f(u), \quad \forall \lambda \in (0, 1). $$

Contrary to this, suppose there exists $\bar{\lambda} \in (0, 1)$ such that

$$ f(\bar{\lambda} x + (1 - \bar{\lambda})u) > f(u). \qquad (12.4) $$

Set $\bar{x} = \bar{\lambda} x + (1 - \bar{\lambda})u$. Then, $f(\bar{x}) > f(u)$, which on using pseudoconvexity of f yields

$$ \nabla f(\bar{x})^T (u - \bar{x}) < 0. $$

Now, $x - \bar{x} = -(1 - \bar{\lambda})(u - x)$ and $u - \bar{x} = \bar{\lambda}(u - x)$. Consequently, we obtain

$$ \nabla f(\bar{x})^T (x - \bar{x}) = -\frac{(1 - \bar{\lambda})}{\bar{\lambda}} \, \nabla f(\bar{x})^T (u - \bar{x}) > 0, $$

which by pseudoconvexity of f gives $f(x) \ge f(\bar{x})$. This along with (12.4) implies

$$ f(x) > f(u), $$

which contradicts the assumed hypothesis. Thus,

$$ f(x) \le f(u) \;\Rightarrow\; f(\lambda x + (1 - \lambda)u) \le f(u), \quad \forall \lambda \in [0, 1], $$

giving f is a quasiconvex function on S.

Corollary 12.3.1 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a differentiable
pseudoconcave function on S. Then f is also a quasiconcave function on S.

Corollary 12.3.2 A pseudolinear function $f : S \to \mathbb{R}$ on S is a quasilinear function on
S.

We now introduce two more classes of functions, namely, strictly pseudoconvex functions
and strictly pseudoconcave functions.

Definition 12.3.4 (Strictly Pseudoconvex Function). Let $S \subseteq \mathbb{R}^n$ be an open set
and $f : S \to \mathbb{R}$ be a differentiable function on S. Then f is called a strictly pseudoconvex
function on S if

$$ x, u \in S, \; x \ne u, \; \nabla f(u)^T (x - u) \ge 0 \;\Rightarrow\; f(x) > f(u). $$

Equivalently,

$$ x, u \in S, \; x \ne u, \; f(x) \le f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) < 0. $$

Definition 12.3.5 (Strictly Pseudoconcave Function). Let $S \subseteq \mathbb{R}^n$ be an open set
and $f : S \to \mathbb{R}$ be a differentiable function on S. Then f is called a strictly pseudoconcave
function on S if

$$ x, u \in S, \; x \ne u, \; \nabla f(u)^T (x - u) \le 0 \;\Rightarrow\; f(x) < f(u). $$

Equivalently,

$$ x, u \in S, \; x \ne u, \; f(x) \ge f(u) \;\Rightarrow\; \nabla f(u)^T (x - u) > 0. $$
It is important to take note of the following.

(i) A strictly pseudoconvex (respectively, pseudoconcave) function on S is a pseudoconvex
(respectively, pseudoconcave) function on S. But the reverse implication is not
true. Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ [its formula is not legible in the source] which
does not possess a stationary point and is a pseudoconvex function on $\mathbb{R}^2$, but which is
not a strictly pseudoconvex function on $\mathbb{R}^2$, as the defining implication fails for
$x = (1, 1)$, $u = (1, -1)$, $x \ne u$. Note that this f is neither a convex function nor a
concave function.
(ii) Observe that if u is a stationary point of a strictly pseudoconvex (respectively,
strictly pseudoconcave) function, i.e. $\nabla f(u) = 0$, then u is the unique global minimizer
(respectively, maximizer) of f on S. This means to say that the graph of a strictly
pseudoconvex (respectively, pseudoconcave) function cannot have more than one
'valley' (respectively, 'peak') with the same lowest depth (respectively, height).


In the theorem below, we shall see that every strictly pseudoconvex function is a
strictly quasiconvex function.

Theorem 12.3.3 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a strictly pseudoconvex
function on S. Then f is a strictly quasiconvex function on S.

Proof. Suppose f is not a strictly quasiconvex function. Then there exist $x, u \in S$, $x \ne u$
with $f(x) \le f(u)$ and $\lambda \in (0, 1)$ such that

$$ f(\lambda x + (1 - \lambda)u) \ge f(u). $$

Writing $\bar{x} = \lambda x + (1 - \lambda)u$, we have, $\bar{x} \ne u$ and $f(u) \le f(\bar{x})$. Using strict pseudoconvexity
of f, it follows that

$$ \nabla f(\bar{x})^T (u - \bar{x}) < 0. $$

Now, $u - \bar{x} = -\dfrac{\lambda}{1 - \lambda}(x - \bar{x})$, hence,

$$ \nabla f(\bar{x})^T (x - \bar{x}) > 0, \quad x \ne \bar{x}, $$

which, again by strict pseudoconvexity, implies $f(x) > f(\bar{x})$, giving $f(x) > f(u)$. This is
contrary to the assumption that $f(x) \le f(u)$. Therefore, f must be a strictly quasiconvex
function on S.

Corollary 12.3.3 Let $S \subseteq \mathbb{R}^n$ be an open convex set and $f : S \to \mathbb{R}$ be a strictly pseudoconcave
function on S. Then f is a strictly quasiconcave function on S.


The relationship between convexity and its various generalizations is summarized as

    strict convexity          ⇒    convexity
        ⇓ (under differentiability)      ⇓ (under differentiability)
    strict pseudoconvexity    ⇒    pseudoconvexity
        ⇓                                ⇓
    strict quasiconvexity     ⇒    quasiconvexity
We now turn our attention to the optimality conditions. Recall the constrained
nonlinear optimization problem

$$ \begin{array}{ll} \text{Min} & f(x) \\ \text{subject to} & g_i(x) \le 0 \quad (i = 1, \ldots, m), \end{array} \qquad (12.5) $$

where $f : \mathbb{R}^n \to \mathbb{R}$, $g_i : \mathbb{R}^n \to \mathbb{R}$ $(i = 1, \ldots, m)$ are differentiable functions on $\mathbb{R}^n$.

We have seen in Chapter 8 that the KKT optimality conditions, which otherwise are
only the necessary optimality conditions for a point $x^* \in \mathbb{R}^n$ to be an optimal solution
of (12.5), also become 'sufficient optimality conditions' under the convexity hypothesis.
Here, we shall be investigating whether the convexity assumption can be replaced by
the weaker assumption of generalized convexity.


Theorem 12.3.4 Let $x^*$ be a feasible solution of problem (12.5), and suppose there exists
$\lambda^* \in \mathbb{R}^m$ such that the following KKT optimality conditions hold:

$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0,$$
$$\lambda_i^* g_i(x^*) = 0 \quad (i = 1, \ldots, m),$$
$$\lambda^* \ge 0.$$

Let $I(x^*) = \{ i : g_i(x^*) = 0 \}$ be the set of active constraints at $x^*$. Suppose $g_i$, $i \in I(x^*)$,
are quasiconvex functions and $f$ is a pseudoconvex function. Then $x^*$ is an optimal
solution of (12.5).


Proof. Let $x$ be any other feasible solution of (12.5). Then

$$g_i(x) \le 0 = g_i(x^*), \quad i \in I(x^*).$$

Using the quasiconvexity hypothesis and $\lambda^* \ge 0$, we get

$$\Big( \sum_{i \in I(x^*)} \lambda_i^* \nabla g_i(x^*) \Big)^T (x - x^*) \le 0.$$

Now, for $i \notin I(x^*)$, $\lambda_i^* = 0$. Thus, the above inequality can be rewritten as

$$\Big( \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) \Big)^T (x - x^*) \le 0,$$

which along with the first equation of the KKT conditions implies

$$\nabla f(x^*)^T (x - x^*) \ge 0.$$

Since $f$ is pseudoconvex, this gives $f(x) \ge f(x^*)$. Thus, $x^*$ is an optimal solution of (12.5). □
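The sufficiency result can be checked numerically on a small instance. The sketch below is only an illustration under assumed data that does not come from the text: the objective f(x) = x + x^3 is pseudoconvex on R (its derivative never vanishes) but not convex, and the single constraint g(x) = -x - 1 <= 0 is linear, hence quasiconvex. The names `f`, `g`, `x_star` and `lam` are hypothetical.

```python
# Assumed illustrative data (not from the text).
def f(x):
    return x + x**3              # pseudoconvex on R, concave for x < 0

def df(x):
    return 1 + 3 * x**2          # strictly positive everywhere

def g(x):
    return -x - 1                # feasible set is x >= -1

def dg(x):
    return -1.0

x_star = -1.0                    # candidate point; the constraint is active here
lam = -df(x_star) / dg(x_star)   # multiplier from grad f + lam * grad g = 0

kkt_holds = (
    abs(df(x_star) + lam * dg(x_star)) < 1e-12   # stationarity
    and abs(lam * g(x_star)) < 1e-12             # complementary slackness
    and lam >= 0                                 # dual feasibility
    and g(x_star) <= 0                           # primal feasibility
)

# Compare f(x_star) with f on a grid of feasible points in [-1, 3].
feasible_grid = [-1 + 4 * i / 400 for i in range(401)]
is_global_min = all(f(x) >= f(x_star) for x in feasible_grid)

print(kkt_holds, is_global_min)
```

The grid check is not a proof; it only confirms on sample points what the theorem guarantees for this instance.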
12.4 Fractional Programming Problems

In this section, we shall be studying an important class of nonlinear programming
problems, namely, fractional programming problems, described as follows

    Min  $\dfrac{f(x)}{g(x)}$
    subject to  $x \in S$                                                           (12.6)

where $S$ is prescribed by a finite number of constraint functions, i.e. $S = \{ x \in \mathbb{R}^n : h_i(x) \le 0 \ (i = 1, \ldots, m) \}$,
and $f, g, h_i : \mathbb{R}^n \to \mathbb{R}$ $(i = 1, \ldots, m)$. In order to have the
ratio function $\frac{f(x)}{g(x)}$ well defined, we assume that $g(x) > 0$, $\forall x \in S$. For the case when
$g(x) < 0$, $\forall x \in S$, we consider the objective ratio of the form $\dfrac{-f(x)}{-g(x)}$ instead.


It is important to note that the assumption on the sign of the function $g$ is natural. In
fact, in practice $g$ is generally a continuous function, hence, if $g$ changes its sign in the
feasible set $S$ (with $S$ connected, e.g. convex) then it will take the value zero at least once
in $S$. In other words, if there exist $x_1, x_2 \in S$ with $g(x_1) > 0$ and $g(x_2) < 0$, then by
continuity of $g$, there exists $\hat{x} \in S$ such that $g(\hat{x}) = 0$. In such a scenario, the ratio
$\frac{f(x)}{g(x)}$ is not appropriately defined on $S$.


If all the functions f, g, hi (i = 1,...,m), involved in problem (12.6) are linear func- 
tions then the fractional programming problem is called the linear fractional program- 
ming problem (LFPP) else it is called the nonlinear fractional programming problem. 
Before proceeding with the theoretical developments and solution methods related to 
the fractional programs, we wish to outline some applications of fractional programming 
problems. 

During the modeling of many industrial problems, like a stock cutting problem, it
has been found useful to consider minimizing the ratio of the waste material and the
used amount of raw material instead of simply minimizing the waste material alone. This
type of formulation leads to linear fractional programming problems. On the other hand,
in problems of resource allocation, it is beneficial to consider maximization
of the profit/cost or profit/revenue ratio rather than just maximizing the profit. In
portfolio optimization problems too, maximizing the return over risk ratio is sought,
leading to optimizing a linear over quadratic ratio. Many stochastic processes, like
ordering of items in periodic review inventory systems or maintenance and replacement
of an item where the expected time between two inspections is taken into account,
involve minimization of the cost over time ratio. A common problem in matrix theory is
finding the largest eigenvalue of a matrix, which is basically maximizing the ratio of
two quadratic functions. In [...], a communication channel is generally described [...]
Charnes and Cooper Algorithm

In this algorithm, a variable transformation is used to convert problem (12.7) into
a linear programming problem while preserving the geometry of the problem. For this,
letting $d^T x + \beta = \dfrac{1}{w}$, the problem can be rewritten as

    Max  $w\,(c^T x + \alpha)$
    subject to  $x \in S$.

Using a transformation $y = wx$, the above problem becomes

    Max  $z = c^T y + \alpha w$
    subject to
         $Ay - bw = 0$
         $d^T y + \beta w = 1$                                                      (12.8)
         $y, w \ge 0$.

It may be noted here that the transformed problem (12.8) is a linear programming
problem with an additional variable and one extra constraint as compared to problem
(12.7).

The feasible solution space of problem (12.8) is denoted by $S^1 = \{ (y, w) : Ay - bw = 0,\ d^T y + \beta w = 1,\ y \ge 0,\ w \ge 0 \}$.
We next prove some results regarding the two feasible sets $S$ and $S^1$.
Result 12.5.1 If $(y, w) \in S^1$ then $w > 0$.

Proof. Suppose $(y, 0) \in S^1$. Then $Ay = 0$ and $d^T y = 1$. Therefore, $y \neq 0$. Let $x \in S$, and
let $\mu > 0$ be an arbitrary real number. Then

$$A(x + \mu y) = Ax + \mu Ay = b,$$
$$x + \mu y \ge 0, \ \text{with at least one positive component.}$$

Thus, $x + \mu y \in S$, $\forall \mu > 0$. This contradicts the boundedness assumption on $S$.
Consequently, $w > 0$, $\forall (y, w) \in S^1$. □
Result 12.5.2 There is one to one correspondence between the feasible solutions of
(12.7) and (12.8) with equal objective values.

Proof. Let $x \in S$. Set $w = \dfrac{1}{d^T x + \beta} > 0$ and $y = wx$. Then $(y, w) \in S^1$. Conversely, if
$(y, w) \in S^1$ then by the previous result $w > 0$ and hence $x = \dfrac{y}{w} \in S$.
The objective values of (12.7) and (12.8) are equal at the two corresponding points, since

$$\frac{c^T x + \alpha}{d^T x + \beta} = c^T y + \alpha w. \qquad \Box$$
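The correspondence between feasible points can be traced with exact rational arithmetic. The following is a minimal sketch, assuming a small standard-form instance (A, b, c, d, alpha, beta chosen here for illustration, with objective x1/(x1 + x2 + 1) and two slack columns). The helper names `dot`, `to_lp_point` and `in_S1` are hypothetical.

```python
from fractions import Fraction as Fr

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def to_lp_point(x, d, beta):
    """Map a feasible x of the fractional program to (y, w) with y = w x."""
    w = 1 / (dot(d, x) + beta)
    return [w * xi for xi in x], w

def in_S1(y, w, A, b, d, beta):
    """Check A y - b w = 0, d^T y + beta w = 1, y >= 0, w >= 0 exactly."""
    rows_ok = all(dot(row, y) - bi * w == 0 for row, bi in zip(A, b))
    norm_ok = dot(d, y) + beta * w == 1
    return rows_ok and norm_ok and w >= 0 and all(yi >= 0 for yi in y)

# Assumed instance: maximize x1 / (x1 + x2 + 1), columns are x1, x2, s1, s2.
A = [[1, 1, 1, 0], [1, 2, 0, 1]]
b = [1, 1]
c, alpha = [1, 0, 0, 0], 0
d, beta = [1, 1, 0, 0], 1

x = [Fr(1, 2), Fr(1, 4), Fr(1, 4), Fr(0)]        # a feasible point: A x = b, x >= 0
y, w = to_lp_point(x, d, beta)

ratio_value = (dot(c, x) + alpha) / (dot(d, x) + beta)   # objective of the fractional program
lp_value = dot(c, y) + alpha * w                          # objective of the linear program
x_back = [yi / w for yi in y]                             # inverse map x = y / w

print(in_S1(y, w, A, b, d, beta), ratio_value == lp_value, x_back == x)
```

Using `Fraction` keeps the equality checks exact, which would not be reliable with floating point.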
Result 12.5.3 There is one to one correspondence between the extreme points of the
sets $S$ and $S^1$.

Proof. Let $x \in S$ be an extreme point of $S$. The corresponding feasible point of (12.8)
is $y = wx$, $w = \dfrac{1}{d^T x + \beta}$.
If $(y, w) \in S^1$ is not an extreme point of $S^1$ then there exist two distinct points
$(y_1, w_1) \in S^1$ and $(y_2, w_2) \in S^1$ such that for some $\lambda \in (0, 1)$,

$$(y, w) = \lambda (y_1, w_1) + (1 - \lambda)(y_2, w_2)$$
$$\Rightarrow \quad x = \frac{y}{w} = \frac{\lambda y_1 + (1 - \lambda) y_2}{w} = \bar{\lambda} x_1 + (1 - \bar{\lambda}) x_2,$$

where $x_1 = \dfrac{y_1}{w_1} \in S$, $x_2 = \dfrac{y_2}{w_2} \in S$ and $\bar{\lambda} = \dfrac{\lambda w_1}{w} \in (0, 1)$, because $w = \lambda w_1 + (1 - \lambda) w_2$.

This implies that $x$ is not an extreme point of $S$, leading to a contradiction. Hence,
$(y, w)$ must be an extreme point of $S^1$. The converse, i.e. if $(y, w) \in S^1$ is
an extreme point of $S^1$ then the corresponding feasible point $x = \dfrac{y}{w} \in S$ is an extreme
point of $S$, can be derived on the same lines. □


Result 12.5.4 There is one to one correspondence between the optimal solutions of
problems (12.7) and (12.8).
    Max  $z = \dfrac{x_1}{x_1 + x_2 + 1}$
    subject to
         $x_1 + x_2 \le 1$
         $x_1 + 2x_2 \le 1$
         $x_1, x_2 \ge 0.$

Note that $x_1 + x_2 + 1 > 0$, for all $(x_1, x_2)$ feasible for the problem. Applying
Charnes and Cooper transformation, $\dfrac{1}{x_1 + x_2 + 1} = w$ and $y_1 = wx_1$, $y_2 = wx_2$, the
problem becomes

    Max  $z = y_1$
    subject to
         $y_1 + y_2 - w \le 0$
         $y_1 + 2y_2 - w \le 0$
         $y_1 + y_2 + w = 1$
         $y_1, y_2, w \ge 0.$

Introducing the slack variables $s_1, s_2 \ge 0$, the problem reduces to

    Max  $z = y_1$
    subject to
         $y_1 + y_2 - w + s_1 = 0$
         $y_1 + 2y_2 - w + s_2 = 0$
         $y_1 + y_2 + w = 1$
         $y_1, y_2, w, s_1, s_2 \ge 0.$

Solving the above problem by the two phase method, the optimal table gives
$y_1 = 1/2$ and $w = 1/2$ [remaining entries of the optimal table: ...].

Therefore, the optimal value is $\frac{1}{2}$ with the optimal solution $y^* = (\frac{1}{2}, 0)$ and $w^* = \frac{1}{2}$.
Thus, the optimal solution of the original linear fractional program is $x^* = \dfrac{y^*}{w^*} = (1, 0)$.
Simplex Algorithm for Linear Fractional Programming Problem

We have observed that the objective function of (12.7) is a quasiconcave function
and the feasible set $S$ is a bounded polyhedral set. So, the optimal solution of (12.7) is
obtained at an extreme point of $S$. Using this simple fact, we study the simplex
algorithm for problem (12.7) without resorting to any variable change. The simplex
algorithm for linear programming problem has already been explained in detail in
Chapter 3 of this book. We assume that the readers are familiar and well versed
with the meanings and interpretations of all the concepts used in linear programming.
We carry forward the same notations while describing the simplex algorithm
here. Since many steps in the present algorithm follow on similar lines,
without going into the detailed reasoning we present only their outline.

We start with an assumption that a basic feasible solution (b.f.s.) with basis $B$ is
available. Therefore, $x_B = B^{-1} b$. Set

$$V_N = c_B^T x_B + \alpha, \qquad V_D = d_B^T x_B + \beta.$$

The value of the objective function at $x_B$ is $z = \dfrac{V_N}{V_D}$.


Let $a^{(j)}$ be a column in the matrix $A$ that is not in $B$. Then, since $B$ is a basis, $a^{(j)}$ can
be expressed as a linear combination of the columns of the matrix $B$, i.e.

$$a^{(j)} = B y_j = \sum_{i=1}^{m} y_{ij}\, b^{(i)}.$$

Set $z_j^1 = c_B^T y_j$, $z_j^2 = d_B^T y_j$.

We first find the condition that ensures that the current b.f.s. can be improved to
get another b.f.s. with improved objective value.

Let $\hat{B}$ be a new basis obtained from $B$ by replacing a column $b^{(r)}$ of $B$ by $a^{(j)}$.
The columns of $\hat{B}$ are given by $\hat{b}^{(r)} = a^{(j)}$, $\hat{b}^{(i)} = b^{(i)}$ $(i \neq r)$. The leaving variable
$x_{B_r}$ is chosen according to the minimum ratio test

$$\theta_j = \min_i \left\{ \frac{x_{B_i}}{y_{ij}} : y_{ij} > 0 \right\} = \frac{x_{B_r}}{y_{rj}} \ \text{(say)}.$$

The new basic variables $\hat{x}_{B_i}$ are given by

$$\hat{x}_{B_i} = x_{B_i} - \theta_j y_{ij} \quad (i = 1, \ldots, m,\ i \neq r), \qquad \hat{x}_{B_r} = \theta_j.$$
The new value of the objective function is $\hat{z} = \dfrac{\hat{V}_N}{\hat{V}_D}$, where

$$\hat{V}_N = V_N - \theta_j (z_j^1 - c_j) \quad \text{and} \quad \hat{V}_D = V_D - \theta_j (z_j^2 - d_j).$$

We pause here to again emphasize that the detailed working of all the above expressions
can be obtained as explained in Chapter 3.

The objective function value will strictly improve if $\hat{z} > z$, i.e.

$$\frac{\hat{V}_N}{\hat{V}_D} - \frac{V_N}{V_D} > 0,$$

which on simplification yields

$$\theta_j \{ V_N (z_j^2 - d_j) - V_D (z_j^1 - c_j) \} > 0.$$

Letting $\Delta_j = V_N (z_j^2 - d_j) - V_D (z_j^1 - c_j)$, we obtain $\theta_j \Delta_j > 0$.

For $\theta_j > 0$, the improvement in the objective value $z$ is possible if $\Delta_j > 0$. Thus, the
current b.f.s. is optimal for (12.7) if $\Delta_j \le 0$ for all those variables $x_j$ which are currently
nonbasic variables. Note that for basic variables, we already have $\Delta_j = 0$. Thus, the
optimality criterion (for the maximization problem (12.7)) is that $\Delta_j \le 0$, $\forall j$.

Suppose $\theta_j = 0$. It means $x_{B_r} = 0$ and $\hat{z} = z$. This implies that there is no improvement
in the objective function value. Here $x_{B_r}$, which is a basic variable with value zero,
becomes a nonbasic variable and the new variable $x_j$ becomes a basic variable with
value zero, while the values of all the other variables remain the same, leading to the
situation of a degenerate b.f.s. In this case, the basis changes but the corresponding
extreme point remains the same. The case of degeneracy in b.f.s. has already been
discussed in Chapter 3.

The above discussion is summarized in the following theorem. 


Theorem 12.5.1 If all $\Delta_j \le 0$ then the current b.f.s. $x_B$ is an optimal solution of the
linear fractional programming problem (12.7).

Remark 12.5.1 If some $\Delta_j > 0$ and for that $j$, $y_{ij} \le 0$, $\forall i$, then the linear fractional
programming problem (12.7) has unbounded solution, which contradicts the boundedness
of the feasible set $S$.

Remark 12.5.2 The objective function of (12.7), $\dfrac{c^T x + \alpha}{d^T x + \beta}$, is a pseudolinear function,
hence, a local optimizer is a global optimizer. Consequently, the b.f.s. $x_B$ with all $\Delta_j \le 0$
is the global optimal solution of problem (12.7).

The mechanism of the simplex algorithm for linear fractional programming problem
is now illustrated through an example.
Example 12.5.2 Solve the following linear fractional programming problem by the
simplex method

    Max  $z = \dfrac{x_1}{x_1 + x_2 + 1}$
    subject to
         $x_1 + x_2 \le 1$
         $x_1 + 2x_2 \le 1$
         $x_1, x_2 \ge 0.$

Solution Introducing the slack variables $s_1, s_2 \ge 0$, the problem reduces to

    Max  $z = \dfrac{x_1}{x_1 + x_2 + 1}$
    subject to
         $x_1 + x_2 + s_1 = 1$
         $x_1 + 2x_2 + s_2 = 1$
         $x_1, x_2, s_1, s_2 \ge 0.$

Solving the problem by the simplex method we get the following tableaus.

                              y^(1)   y^(2)   y^(s1)   y^(s2)
    s_1 = 1                     1       1       1        0
    s_2 = 1                     1       2       0        1
    z_j^1 - c_j                -1       0       0        0
    z_j^2 - d_j                -1      -1       0        0
    z_B = 0,    Delta_j  ->     1       0       0        0

                              y^(1)   y^(2)   y^(s1)   y^(s2)
    x_1 = 1                     1       1       1        0
    s_2 = 0                     0       1      -1        1
    z_j^1 - c_j                 0       1       1        0
    z_j^2 - d_j                 0       0       1        0
    z_B = 1/2,  Delta_j  ->     0      -2      -1        0

Since $\Delta_j \le 0$, $\forall j$, the optimality criterion is satisfied. The optimal solution of
the given problem is $x_1 = 1$, $x_2 = 0$ and the optimal value is $\frac{1}{2}$.
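The Delta_j row of each tableau follows mechanically from the definitions of V_N, V_D, z_j^1 and z_j^2. The sketch below recomputes it, assuming the data of Example 12.5.2 as read here (objective x1/(x1 + x2 + 1), so c = (1, 0, 0, 0), d = (1, 1, 0, 0), alpha = 0, beta = 1; this reading is inferred from the tableau rows). The function name `delta_row` is hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def delta_row(Y, xB, cB, dB, c, d, alpha, beta):
    """Return (V_N, V_D, [Delta_j]) for a tableau body Y with Y[i][j] = y_ij."""
    VN = dot(cB, xB) + alpha
    VD = dot(dB, xB) + beta
    deltas = []
    for j in range(len(c)):
        col = [row[j] for row in Y]
        z1 = dot(cB, col)                      # z_j^1 = c_B^T y_j
        z2 = dot(dB, col)                      # z_j^2 = d_B^T y_j
        deltas.append(VN * (z2 - d[j]) - VD * (z1 - c[j]))
    return VN, VD, deltas

# Assumed problem data; columns are x1, x2, s1, s2.
c, alpha = [1, 0, 0, 0], 0
d, beta = [1, 1, 0, 0], 1

# First tableau: basis (s1, s2).
first = delta_row([[1, 1, 1, 0], [1, 2, 0, 1]], [1, 1], [0, 0], [0, 0], c, d, alpha, beta)

# Second tableau: basis (x1, s2) after x1 enters and s1 leaves.
second = delta_row([[1, 1, 1, 0], [0, 1, -1, 1]], [1, 0], [1, 0], [1, 0], c, d, alpha, beta)

print(first)
print(second)
```

A positive entry in the first Delta row marks x1 as the entering variable; in the second tableau no entry is positive, which is the optimality test of Theorem 12.5.1.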
12.6 Nonlinear Fractional Programming Problems

In the previous section, we have concentrated on studying the solution methodology of
linear fractional programming problems. In this section we move ahead to a solution
procedure for a subclass of the nonlinear fractional programming problems. The
class of nonlinear fractional programming problems is too large to be solved by
a single algorithm. Thus, when we say a subclass we actually mean a particular class
of nonlinear fractional programming problems in which the numerators are concave
functions and the denominators are positive convex functions. Such nonlinear fractional
programming problems are called concave-convex fractional programming problems.
Note that for such problems the objective ratio is a pseudoconcave function.

The algorithm presented below is called the Dinkelbach's algorithm, named after
W. Dinkelbach of Germany who first described this algorithm in 1967. Recall that the
general nonlinear fractional programming problem is given by

    Max  $\dfrac{f(x)}{g(x)}$
    subject to
         $x \in S = \{ x \in \mathbb{R}^n : h_i(x) \le 0 \ (i = 1, \ldots, m) \}.$                (12.9)

We assume that $S$ is a nonempty compact convex set and $f, g : \mathbb{R}^n \to \mathbb{R}$ are
continuous functions on $S$ with $g(x) > 0$, $\forall x \in S$.


For instance, the assumption that $S$ is a convex set is true if each $h_i$ $(i = 1, \ldots, m)$ is
a quasiconvex function on $\mathbb{R}^n$, as $S = \bigcap_{i=1}^{m} \{ 0\text{-level set of } h_i \}$. Also, $S$ is a closed set if
each $h_i$ $(i = 1, \ldots, m)$ is a continuous function on $\mathbb{R}^n$.

The algorithm for solving (12.9) shall make use of the following auxiliary problem
with parameter $q \in \mathbb{R}$.

    Max  $f(x) - q\,g(x)$
    subject to  $x \in S$.                                                          (12.10)

Observe that if $f$ is concave, $g$ is convex and $q \ge 0$, then $f - qg$ is a concave function,
whereas $\dfrac{f}{g}$ need not be a concave function. This makes (12.10) easier to solve than (12.9).
Moreover, $f$ and $g$ are continuous functions on a compact convex set $S$, hence the two
problems (12.9) and (12.10) possess optimal solutions in $S$.

Denote by $F(q)$ the optimum objective value of (12.10), i.e.

$$F(q) = \operatorname{Max}\{ f(x) - q\,g(x) : x \in S \}, \quad q \in \mathbb{R}.$$







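The function $F$ has some nice properties which we would like to share with the
readers.

Result 12.6.1 $F$ is a convex function on $\mathbb{R}$.

Proof. Let $q_1, q_2 \in \mathbb{R}$, $\lambda \in [0, 1]$ and $q = \lambda q_1 + (1 - \lambda) q_2$. Let $x_q \in S$ be the maximizer
of $f(x) - q\,g(x)$ over $S$. Then

$$
\begin{aligned}
F(q) &= \operatorname{Max}\{ f(x) - q\,g(x) : x \in S \} \\
     &= f(x_q) - q\,g(x_q) \\
     &= f(x_q) - (\lambda q_1 + (1 - \lambda) q_2)\, g(x_q) \\
     &= \lambda \big( f(x_q) - q_1 g(x_q) \big) + (1 - \lambda) \big( f(x_q) - q_2 g(x_q) \big) \\
     &\le \lambda \operatorname{Max}_{x \in S} \big( f(x) - q_1 g(x) \big) + (1 - \lambda) \operatorname{Max}_{x \in S} \big( f(x) - q_2 g(x) \big) \\
     &= \lambda F(q_1) + (1 - \lambda) F(q_2).
\end{aligned}
$$

Thus,

$$F(\lambda q_1 + (1 - \lambda) q_2) \le \lambda F(q_1) + (1 - \lambda) F(q_2), \quad \forall \lambda \in [0, 1],\ \forall q_1, q_2 \in \mathbb{R}.$$

This completes the proof. □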
Result 12.6.2 $F$ is a continuous function on $\mathbb{R}$.

Proof. Using Result 12.6.1 along with the fact that a convex function defined on an
open convex set (here, it is $\mathbb{R}$) is continuous in its domain, we get the desired result. □

Result 12.6.3 $F$ is a strictly monotonic decreasing function on $\mathbb{R}$.

Proof. Let $q_1, q_2 \in \mathbb{R}$ with $q_1 < q_2$. Suppose $x_2 \in S$ is the point where $F(q_2)$ attains
its maximum value. Then,

$$
\begin{aligned}
F(q_2) &= f(x_2) - q_2\, g(x_2) \\
       &< f(x_2) - q_1\, g(x_2) \\
       &\le F(q_1),
\end{aligned}
$$

where the strict inequality follows on account of $q_1 < q_2$ and $g(x_2) > 0$. Thus,

$$q_1 < q_2 \ \Rightarrow\ F(q_2) < F(q_1),$$

thereby implying that $F$ is a strictly monotonic decreasing function on $\mathbb{R}$. □

Result 12.6.4 The nonlinear equation $F(q) = 0$ has a unique solution in $\mathbb{R}$.

Proof follows by virtue of Result 12.6.2 and Result 12.6.3. □
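The properties of F stated in Results 12.6.1 to 12.6.4 can be observed on a small instance. The sketch below assumes, purely for illustration, f(x) = x1 and g(x) = x1 + x2 + 1 on the triangle with vertices (0, 0), (1, 0) and (0, 1/2). Since f - q g is linear in x, its maximum over the triangle is attained at a vertex, so F(q) is computed by enumerating the vertices. The names `VERTICES` and `F` are hypothetical.

```python
from fractions import Fraction as Fr

# Assumed instance: vertices of {x1 + x2 <= 1, x1 + 2 x2 <= 1, x1, x2 >= 0}.
VERTICES = [(Fr(0), Fr(0)), (Fr(1), Fr(0)), (Fr(0), Fr(1, 2))]

def f(x):
    return x[0]

def g(x):
    return x[0] + x[1] + 1

def F(q):
    """F(q) = max over S of f - q g; a vertex suffices because f - q g is linear."""
    return max(f(v) - q * g(v) for v in VERTICES)

qs = [Fr(k, 10) for k in range(-10, 21)]          # sample parameters in [-1, 2]

strictly_decreasing = all(F(a) > F(b) for a, b in zip(qs, qs[1:]))
midpoint_convex = all(F((a + b) / 2) <= (F(a) + F(b)) / 2 for a in qs for b in qs)
root_value = F(Fr(1, 2))                           # q = 1/2 is the optimal ratio here

print(strictly_decreasing, midpoint_convex, root_value)
```

On the sampled grid F is strictly decreasing and midpoint convex, and it vanishes at q = 1/2, the maximum value of the ratio on this triangle.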


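Result 12.6.5 Let $x^* \in S$ and $q^* = \dfrac{f(x^*)}{g(x^*)}$. Then $F(q^*) \ge 0$.

Proof. $F(q^*) = \operatorname{Max}\{ f(x) - q^* g(x) : x \in S \} \ge f(x^*) - q^* g(x^*) = 0$. □

Theorem 12.6.1 $x^* \in S$ is an optimal solution of $\operatorname{Max}\left\{ \dfrac{f(x)}{g(x)} : x \in S \right\}$ if and only if $x^*$ is an optimal
solution of $\operatorname{Max}\{ f(x) - q^* g(x) : x \in S \}$ with optimal objective value zero, where $q^* = \dfrac{f(x^*)}{g(x^*)}$.

Proof. Let $x^*$ be an optimal solution of (12.9). Then $\dfrac{f(x)}{g(x)} \le \dfrac{f(x^*)}{g(x^*)} = q^*$, $\forall x \in S$, i.e.

$$f(x) - q^* g(x) \le 0 = f(x^*) - q^* g(x^*), \quad \forall x \in S.$$

Consequently, $x^*$ is an optimal solution of the problem

$$F(q^*) = \operatorname{Max}\{ f(x) - q^* g(x) : x \in S \}$$

with $F(q^*) = 0$.
The converse follows by tracing the steps backward. □

A similar theorem can be stated and proved for the minimization case as well.

Theorem 12.6.2 $x^* \in S$ is an optimal solution of $\operatorname{Min}\left\{ \dfrac{f(x)}{g(x)} : x \in S \right\}$ if and only if $x^*$
is an optimal solution of $\operatorname{Min}\{ f(x) - q^* g(x) : x \in S \}$ with optimal objective value zero,
where $q^* = \dfrac{f(x^*)}{g(x^*)}$.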





Dinkelbach's Algorithm

As a consequence of the above results and theorems, it follows that there is a
correspondence between the optimal solutions of the nonlinear fractional programming
problem (12.9) and the nonlinear parametric programming problem (12.10). Taking
clue from this, a mechanism is developed which solves the nonlinear parametric
programming problem (12.10) that in turn provides an optimal solution of the original
nonlinear fractional programming problem (12.9).

In view of Theorem 12.6.1, solving (12.9) is equivalent to finding the root of the
equation $F(q) = 0$ which, on account of Result 12.6.4, is unique. An iterative scheme is
proposed to achieve this aim.


We begin the algorithm with $q_0 = 0$ (or we can start with any other value of $q$ with
$F(q) \ge 0$), and solve

$$F(q_0) = \operatorname{Max}\{ f(x) : x \in S \} \ge 0.$$

This problem is a constrained convex program, hence, its optimal solution can be
determined by applying an appropriate nonlinear convex optimization technique.
Let $x_0$ be an optimal solution of this problem.

Two cases arise, either $F(q_0) = 0$ or $F(q_0) > 0$. If $F(q_0) = 0$ then $x_0$ is an optimal
solution of (12.9) and the process terminates.

Suppose $F(q_0) > 0$. Set $k = 0$, $q_{k+1} = \dfrac{f(x_k)}{g(x_k)} > q_k$, and solve

$$F(q_{k+1}) = \operatorname{Max}\{ f(x) - q_{k+1}\, g(x) : x \in S \} \ge 0.$$

Let $x_{k+1}$ be an optimal solution of this convex problem (which can be determined by
some numerical technique). Then

$$F(q_{k+1}) = f(x_{k+1}) - q_{k+1}\, g(x_{k+1}) \ge f(x_k) - q_{k+1}\, g(x_k) = 0.$$

Two cases can be discussed.
If $F(q_{k+1}) = 0$ then $x_{k+1}$ is an optimal solution of (12.9), and so stop the procedure.
Else if $F(q_{k+1}) > 0$ then set $k \leftarrow k + 1$ and continue.
It is important to take note of the following points for implementation convenience
of the Dinkelbach's algorithm.

(i) From a computational viewpoint, it is convenient and acceptable to predecide the
tolerance $\delta > 0$, and then if $F(q_k) < \delta$ (instead of $F(q_k) = 0$ exactly), the procedure can be
stopped. If $F(q_k) \ge \delta$, then the procedure is continued.
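The iteration can be sketched in a few lines once a routine for the parametric subproblem is available. The code below is a minimal sketch, not a definitive implementation: `dinkelbach` takes a user-supplied `argmax(q)` that returns a maximizer of f - q g over S, and the small instance used to exercise it (f(x) = x1, g(x) = x1 + x2 + 1 on a triangle, solved by vertex enumeration) is an assumption made for illustration. The stopping test uses `<= tol` so that exact arithmetic with tol = 0 terminates.

```python
from fractions import Fraction as Fr

def dinkelbach(f, g, argmax, q0=0, tol=0, max_iter=100):
    """Iterate q <- f(x)/g(x), where x maximizes f - q g, until F(q) <= tol."""
    q = q0
    for _ in range(max_iter):
        x = argmax(q)                 # solve the parametric subproblem for the current q
        Fq = f(x) - q * g(x)          # F(q), the optimal value of the subproblem
        if Fq <= tol:
            return x, q, Fq
        q = f(x) / g(x)               # next parameter
    raise RuntimeError("iteration limit reached before F(q) <= tol")

# Assumed instance: a linear ratio over a triangle, so vertices suffice.
VERTICES = [(Fr(0), Fr(0)), (Fr(1), Fr(0)), (Fr(0), Fr(1, 2))]

def f(x):
    return x[0]

def g(x):
    return x[0] + x[1] + 1

def argmax(q):
    return max(VERTICES, key=lambda v: f(v) - q * g(v))

x_opt, q_opt, F_opt = dinkelbach(f, g, argmax)
print(x_opt, q_opt, F_opt)
```

For a general concave-convex problem `argmax` would call a convex optimization routine; only that function changes, the outer loop stays the same.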


(ii) At each iteration, a constrained convex optimization problem

$$F(q_k) = \operatorname{Max}\{ f(x) - q_k\, g(x) : x \in S \} \qquad (12.11)$$

is solved. Thus, it is sufficient to generate the KKT point of the problem. For the
unconstrained case, when $S = \mathbb{R}^n$, it is equivalent to finding the stationary point of the
problem provided the functions involved in the problem are differentiable. Note that an
optimal solution of (12.11) [...]

The sequence of parameters $\{q_k\}$ generated in the algorithm is steadily increasing.
With $F$ being strictly monotonic decreasing, the sequence approaches the unique root of
the equation $F(q) = 0$, as depicted graphically in Fig 12.4. This means that the algorithm
converges to the optimal solution of (12.9).
Proof. We first observe that $q^*$ is an upper bound of the sequence $\{q_k\}$, i.e. $q_k \le q^*$, $\forall k \in \mathbb{N}$.
This is because for $q > q^*$, $F(q) < 0$, and the algorithm would have stopped much before
this situation arises.

Thus, $\{q_k\}$ is a monotonically increasing sequence bounded from above, hence, it must
be convergent. Let

$$\lim_{k \to \infty} q_k = \bar{q}.$$

As $F$ is a continuous function,

$$\lim_{k \to \infty} F(q_k) = F(\bar{q}).$$

By construction of the sequence in the algorithm, we get

$$F(\bar{q}) = 0 = F(q^*).$$

Invoking Result 12.6.4, $\bar{q} = q^*$, thus completing the proof. □

Example 12.6.1 Solve the following nonlinear fractional program by the Dinkelbach's
algorithm

    Max  $\dfrac{-3x^2 - 2y^2 + 4x + 8y - 8}{x^2 + y^2 - 6y + 8}$
    subject to
         $x + 3y \le 5$
         $x, y \ge 0.$

Here $g(x, y) = x^2 + (y - 3)^2 - 1$ is a strictly convex function.
Also, $f(x, y) = -3x^2 - 2y^2 + 4x + 8y - 8$ is a concave function on $S$ as
$\nabla^2 f(x, y) = \begin{pmatrix} -6 & 0 \\ 0 & -4 \end{pmatrix}$ is a negative definite matrix.

Set the tolerance $\delta = 0.001$. Start the algorithm with $q_0 = 0$. The following are
the results for two iterations [table of iterates: ...].

$F(0.522) = f(x_2, y_2) - 0.522\, g(x_2, y_2) = 0.000451 < \delta$. The optimal solution of the given
problem is $x^* = 0.4066$ and $y^* = 1.5312$.
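The iterations of this example can be reproduced with a short script. The sketch below rests on an assumed reading of the problem data: f(x, y) = -3x^2 - 2y^2 + 4x + 8y - 8, g(x, y) = x^2 + y^2 - 6y + 8 and the single linear constraint x + 3y <= 5 (this reading is consistent with the reported solution, but it is an inference from a damaged source). Because f - q g is a separable concave quadratic for q > -2, the subproblem is solved through its KKT conditions by bisection on the multiplier of the linear constraint. This is a technique chosen here for self-containment and is not necessarily the method used in the book.

```python
def f(x, y):
    return -3 * x**2 - 2 * y**2 + 4 * x + 8 * y - 8    # assumed numerator

def g(x, y):
    return x**2 + y**2 - 6 * y + 8                      # assumed denominator

def inner_argmax(q):
    """Maximize f - q g over {x + 3y <= 5, x >= 0, y >= 0}, valid for q > -2."""
    def point(mu):
        # Stationarity of f - q g - mu (x + 3y), clipped at the bounds x, y >= 0.
        x = max(0.0, (4 - mu) / (6 + 2 * q))
        y = max(0.0, (8 + 6 * q - 3 * mu) / (4 + 2 * q))
        return x, y

    x, y = point(0.0)
    if x + 3 * y <= 5:                  # linear constraint inactive, mu = 0
        return x, y
    lo, hi = 0.0, max(4.0, (8 + 6 * q) / 3)   # at hi both coordinates are zero
    for _ in range(200):                # bisection on the multiplier mu
        mid = (lo + hi) / 2
        x, y = point(mid)
        if x + 3 * y > 5:
            lo = mid
        else:
            hi = mid
    return point(hi)

def dinkelbach(q=0.0, delta=1e-3, max_iter=50):
    for _ in range(max_iter):
        x, y = inner_argmax(q)
        Fq = f(x, y) - q * g(x, y)
        if Fq < delta:                  # stopping rule with tolerance delta
            return x, y, q, Fq
        q = f(x, y) / g(x, y)
    raise RuntimeError("iteration limit reached")

x_opt, y_opt, q_opt, F_opt = dinkelbach()
print(round(x_opt, 4), round(y_opt, 4), round(q_opt, 3), F_opt)
```

Under these assumptions the run should land close to the values reported in the example (q near 0.522, x near 0.4066, y near 1.5312), with small differences due to rounding of q.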


12.7 Summary and Additional Notes 


• In Section 12.2 and Section 12.3, we introduced several new classes of generalized
  convex functions. Many important properties, including the local-global solution
  characterizations, are derived for the newly defined classes. Although we have carried
  out an in-depth discussion on these classes of functions, for more reading refer to
  the excellent texts by Avriel et al. [6], Bazaraa and Shetty [11], Mangasarian [109],
  Martos [114]. The research articles by Ferland [56], Ponstein [129] and Greenberg [...]

• [...] To know more about this class of optimization problems one can refer to the
  book by [...] and the articles by Stancu-Minasian [...]

• [...] have been reported in the literature for various types of fractional programming
  problems. The excellent survey by Stancu-Minasian [147] and Schaible and Ibaraki [141]
  provide good references in this regard.

• We realize the fact that solving a general nonlinear fractional programming problem
  is very hard and challenging. Some successful attempts have been reported in recent
  years to solve some special structured fractional programming problems, like,
  piecewise-linear fractional programming problems, minmax fractional programming
  problems and stochastic fractional programming problems.
12.8 Exercises

12.1 Let $S \subseteq \mathbb{R}^n$ be a nonempty convex set containing origin and $f : S \to \mathbb{R}$ be a
quasiconvex function on $S$ with $f(x) \ge f(0)$, $\forall x \in S$. Show that $f(\lambda x) \le f(x)$, $\forall \lambda \in [0, 1]$.

12.2 Let $S \subseteq \mathbb{R}^n$ be a nonempty convex set. Show that the function $f : S \to \mathbb{R}$ is
quasiconvex on $S$ if and only if for every $x_1, x_2 \in S$, the one-dimensional function
$g : [0, 1] \to \mathbb{R}$ defined as $g(\lambda) = f(\lambda x_1 + (1 - \lambda) x_2)$ is quasiconvex on $[0, 1]$.

12.3 Let $f(x) = \frac{1}{2} x^T Q x + c^T x$, where $Q$ is an $n \times n$ symmetric matrix. Show that $f$ is convex
in $\mathbb{R}^n$ if and only if it is quasiconvex on $\mathbb{R}^n$. Is the above statement true if we restrict
the domain of $f$ to $\mathbb{R}^n_+$, i.e. $x \in \mathbb{R}^n_+$? Justify your arguments.

12.4 Let $\{ f_\gamma : \gamma \in \Gamma \}$ be an arbitrary family of quasiconvex functions on $\mathbb{R}^n$, and let
$g(x) = \sup_{\gamma \in \Gamma} f_\gamma(x)$. Show that $g$ is a quasiconvex function on $\mathbb{R}^n$.
12.5 For $\gamma > 0$, consider the following family of functions,

$$
f_\gamma(x) =
\begin{cases}
x^2 + \gamma (x - 10), & [\ldots] \\
\gamma (x - 10), & [\ldots] \\
(x - 1)^2 + \gamma (x - 10), & [\ldots]
\end{cases}
$$

For a fixed $\gamma > 0$, is $f_\gamma$ a quasiconvex or a pseudoconvex function on $[-1, 2]$?
Let $g(x) = \sup_{\gamma > 0} f_\gamma(x)$. Is $g$ a quasiconvex or a strictly quasiconvex function on
$[-1, 2]$? Is $g$ a pseudoconvex function on $[-1, 2]$? Justify your answers.

12.6 Let $f$ be a positive quasiconcave function defined over a convex subset $S$ of $\mathbb{R}^n$.
Show that $g(x) = 1 / f(x)$ is a quasiconvex function on $S$. What can you say about the
reciprocal function, $1/f$, if $f$ is a positive quasiconvex function?

12.7 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a quasiconvex function and let $g : \mathbb{R} \to \mathbb{R}$ be a nondecreasing
function. Show that the composite function $g \circ f : \mathbb{R}^n \to \mathbb{R}$ is a quasiconvex function.
12.8 Let $S \subseteq \mathbb{R}^n$ be a nonempty convex set and $g, h : S \to \mathbb{R}$. Define
$f : S \to \mathbb{R}$ by $f(x) = g(x) h(x)$. Show that $f$ is quasiconvex if the following two conditions
hold

(i) $g$ is convex on $S$ and $g(x) \le 0$, $\forall x \in S$

(ii) $h$ is concave on $S$ and $h(x) > 0$, $\forall x \in S$.

12.9 Let $S \subseteq \mathbb{R}^n$ be a nonempty open convex set. Consider a function $f : S \to \mathbb{R}$
with $f(x) > 0$, $\forall x \in S$. Show that if $\ln(f)$ is a concave function on $S$ then $f$ is a
pseudoconcave function on $S$.

12.10 Solve the following linear fractional programming problems by both the Charnes
and Cooper method and the simplex method
(i) Max $z = [\ldots]$
    subject to
         $[\ldots]$
         $2x_1 + x_2\ [\ldots]\ 4$
         $x_1, x_2 \ge 0.$

(ii) Max $z = [\ldots]$
    subject to
         $3x_1 + 5x_2\ [\ldots]\ 15$
         $4x_1 + 3x_2\ [\ldots]\ 12$
         $x_1, x_2 \ge 0.$

(iii) Max $z = [\ldots]$
    subject to
         $2x_1 + 5x_2\ [\ldots]\ 10$
         $4x_1 + 3x_2\ [\ldots]\ 20$
         $[\ldots]\ 2$
         $x_1, x_2 \ge 0.$


12.11 With the linear fractional programming problem (LFPP) [...], associate a linear
programming problem through the Charnes and Cooper transformation. Let $(y^*, w^*)$ be
an optimal solution of the linear programming problem. If $w^* \neq 0$, show that
$x^* = y^* / w^*$ is an optimal solution of the (LFPP) and the objective function values of
the (LFPP) and the linear program are equal.

12.12 Suppose $w^* = 0$ in the Charnes and Cooper transformation. Prove that the
feasible set of (LFPP) is unbounded.


12.13 Solve the following fractional programming problems graphically

(i)  $\operatorname{Min}_{(x_1, x_2) \in S}\ \dfrac{[\ldots]x_1 + 2x_2 + 6}{3x_1 + 5x_2 + 3}$

(ii) $\operatorname{Min}_{(x_1, x_2) \in S}\ \dfrac{[\ldots]x_1 + [\ldots]x_2 + 6}{3x_1 + 5x_2 + 3}$

where $S = \{ (x_1, x_2) : [\ldots] \le 2,\ 2x_1 - x_2 \le 6,\ x_1, x_2 \ge 0 \}$. Which of (i) and (ii) has an
alternate optimal solution? Justify your answer.


12.14 Solve the following fractional programming problems by the Dinkelbach's method

(i) Max $z = \dfrac{5x_1 + 3x_2}{[\ldots] + 2x_2 + 1}$
    subject to
         $3x_1 + 5x_2 \le 15$
         $5x_1 + 2x_2\ [\ldots]\ 10$
         $x_1, x_2 \ge 0.$

(ii) Max $z = \dfrac{2x_1 + 2x_2 + 1}{[\ldots] + 3}$
    subject to
         $x_1 + x_2 \le 3$
         $x_1, x_2 \ge 0.$

(iii) Max $z = [\ldots]$
    subject to
         $x_1 + x_2\ [\ldots]\ 1$
         $x_1, x_2 \ge 0.$











13
Multi-objective Optimization: Theory and Methods


13.1 Introduction 


By now we are well versed with the concepts of linear and nonlinear programming
problems involving optimization of a single objective function. We now take a step further
and enter the area where the programming problems require optimization of more
than one objective function. To get comfortable with this idea, we begin our discussion
by considering a very simple decision making problem of buying a car. Let us pose a
question to ourselves. What features will we look for in a car of our choice? Cost, comfort,
space, mileage, safety, engine power, height from the ground level, power steering, color,
and maybe some additional technical features, like, power windows, quality of AC,
and so on. From this list it is obvious that a decision of buying a particular car is not
based on a single criterion, of say cost alone, but many more criteria are equally vital in
the final decision. Problems of this kind cannot be modeled through single objective
optimization problems. One needs to look beyond that and talk about multiobjective
programming (MOP). As the name is self explanatory, a multiobjective programming
problem (MOPP) deals with optimization of more than one objective criterion. MOPPs
are frequently encountered in problems of product design, management decisions,
resource planning, to name a few. This branch of optimization is also known by other
names, like, vector optimization, multicriteria optimization, multiattribute
optimization and so forth.

In this chapter, we shall be studying multiobjective optimization problems. We shall be
describing and characterizing various solution concepts associated with the class of
MOPPs. In the later part of the chapter, we shall be discussing a technique [...] to solve
a particular class of linear multiobjective programming problems. [...] the vast and
complex topic of MOP. We [...] complexities and restrict ourselves to understand the [...]
13.2 Conflicting [...]

Going back to the problem of buying a car, it is evident that some of the several
listed criteria are conflicting in nature, like, more comfort can generally be achieved by
increasing the cost of a car, more advanced technical features will also enhance the cost
of a car, a luxury car provides more comfort but generally gives less mileage. In practice
too, MOPP involves several conflicting and noncommensurate objective functions that
have to be optimized simultaneously over a feasible region.

For instance, consider two objectives represented by functions, say, $f_1(x) = x^2$ and
$f_2(x) = (x - 1)^2$. We wish to minimize them simultaneously over the set $[0, 1]$.
We may sketch the graphs of $f_1(x)$ and $f_2(x)$, and observe that while $f_1$ is increasing
in $[0, 1]$, $f_2$ is decreasing in $[0, 1]$, thereby resulting in a situation where it is not possible
to find an $x \in [0, 1]$ that can minimize both the objectives.
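The conflict between the two objectives can be seen on a grid. The sketch below takes f1(x) = x^2 and f2(x) = (x - 1)^2 as an assumed reading of the two functions (the exponents are not fully legible in the source) and looks for a grid point of [0, 1] that minimizes both at once.

```python
def f1(x):
    return x**2            # assumed form; increasing on [0, 1]

def f2(x):
    return (x - 1)**2      # assumed form; decreasing on [0, 1]

grid = [i / 100 for i in range(101)]      # sample points of [0, 1]

argmin1 = min(grid, key=f1)               # minimizer of f1 on the grid
argmin2 = min(grid, key=f2)               # minimizer of f2 on the grid

# Points that attain both individual minimum values simultaneously.
common = [x for x in grid if f1(x) == f1(argmin1) and f2(x) == f2(argmin2)]

print(argmin1, argmin2, common)
```

The two minimizers sit at opposite ends of the interval and no grid point attains both minima, which is the situation that motivates compromise solutions.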

If we can find a feasible vector $x^*$ that optimizes all the objective criteria
simultaneously then we have certainly achieved an ideal solution of the problem. But quite
often, improvement in one criterion results in a loss in another criterion, making the
existence of an ideal solution unlikely. It can be seen as some kind of tradeoff between
various objectives. Visualizing and resolving the tradeoffs is one of the key aspects of
MOP. For this reason, one has to look for the 'best' compromise solution. Now, 'best'
can be defined differently in different situations. In economics, 'best' refers to the
decisions taken by the buyers and sellers or the governments which simultaneously
optimize several criteria. One of the most frequently quoted examples thereof is taxation.
An optimal collected tax is one which maximizes the revenue for common goods while
maintaining sufficient incentives for individuals to earn reasonably good income from
their work.

Not surprisingly, the first [...] Y. Edgeworth. He defi[...]

[...] chair of political economy at the University of [...], wrote in 1906, "The optimum
allocation of the resources of a society is not attained so long as it is possible to make
at least one individual better off in his own estimation while keeping others as well off
as before in their own estimations". These statements beautifully capture the balance of
the socio-economic structure of any society. The very same idea is applicable to MOPPs.
One needs to create a balance between [...]
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33 Various Solution Concepts 
13 


The following form of MOPP is studied in this chapter:

$\min_{x \in S} \; f(x) = (f_1(x), \ldots, f_p(x))$. (13.1)

We assume that $S$ is a nonempty subset of $\mathbb{R}^n$ representing the feasible set of (13.1), and $f : S \to \mathbb{R}^p$ is a given function comprising $p$ objective criteria to be minimized.


For $p = 1$, problem (13.1) reduces to a scalar nonlinear programming problem which has been a subject of study in the previous chapters. So, for the multiobjective case, we take $p \ge 2$. Moreover, it is not necessary that all the $p$ objective criteria are to be minimized; some criteria may involve a maximization process. For instance, in the car buying problem discussed earlier, we would like to maximize comfort, maximize mileage and minimize the cost of a car. Actually, in the context of modeling the problem it does not matter whether we have a minimization or a maximization problem. One can convert all the maximization criteria into the minimization form by using the identity $\max f_i(x) = -\min(-f_i(x))$. With this understanding, we study MOPP in the minimization form in problem (13.1). Observe that we have yet to define what we mean by minimization of a vector objective function, $f(x) \in \mathbb{R}^p$ with $x \in S$, in (13.1).
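The conversion identity can be checked on a small finite set of criterion values. The sketch below is only an illustration; the numbers in `values` are made up and do not come from any example in the text.

```python
# Hypothetical values of one criterion f_i over a finite feasible set
# (illustrative data only).
values = [3.0, -1.5, 7.25, 0.0]

max_direct = max(values)
# Max f_i(x) = -Min(-f_i(x)): negate, minimize, negate back.
max_via_min = -min(-v for v in values)

print(max_direct, max_via_min)
```

Both printed numbers coincide, which is why the chapter can restrict itself to the minimization form.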

The basic problem to realize here is that, unlike the real space $\mathbb{R}$, the space $\mathbb{R}^p$, $p \ge 2$, is not an ordered space unless we define an appropriate partial order. Given any two distinct real numbers x and y it is always possible to determine the greater among them. But the same is not true if x and y are vectors in $\mathbb{R}^p$. For instance, it is not possible to compare the vectors $(2,1)^T$ and $(1,2)^T$, in general.

Now, for $x \in S$, we get the resultant objective vector $f(x)$ in $\mathbb{R}^p$. Let $f(S) = \{y \in \mathbb{R}^p \mid \exists\, x \in S \text{ such that } y = f(x)\}$ denote the image set of $S$ under $f$. To define an optimal solution in the sense of minimization we need to compare vectors in the image set $f(S)$, and for this, we need to identify a partial order relation in $\mathbb{R}^p$. To appreciate this aspect of MOPP, we consider the problem

$\min_{x \in [0,1]} \; (x^2, (x-1)^2)$.












Let $y_1 = x^2$ and $y_2 = (x-1)^2$. For $(y_1, y_2) \in f(S)$, there exists $x \in S$ such that $y_1 = x^2$, $y_2 = (x-1)^2$. As $S = [0,1]$, we have $y_1 \ge 0$, $y_2 \ge 0$. Moreover, $x = \sqrt{y_1}$ and $(x-1)^2 = y_2$ yields the set $f(S) = \{(y_1, y_2) : y_1 \ge 0,\ y_2 \ge 0,\ \sqrt{y_1} + \sqrt{y_2} = 1\}$. Now if we choose any two vectors in the set $f(S)$, say $(1,0)$ and $(0,1)$, we cannot decide which one is the greater of the two unless some priorities are attached to the objective functions.
In this chapter we assume that $\mathbb{R}^p$ is partially ordered by a binary relation induced by $\mathbb{R}^p_+$, the nonnegative orthant of $\mathbb{R}^p$. By this we mean the following. For $x, y \in \mathbb{R}^p$,

$x \leqq_{\mathbb{R}^p_+} y \iff y - x \in \mathbb{R}^p_+$;
$x \le_{\mathbb{R}^p_+} y \iff y - x \in \mathbb{R}^p_+ \setminus \{0\}$;
$x <_{\mathbb{R}^p_+} y \iff y - x \in \operatorname{int} \mathbb{R}^p_+$.

2 ; 1 1 
mas ( 2) xe (3 ) wie (2 ) se (3 


It is important to note the difference between $x \leqq_{\mathbb{R}^p_+} y$ and $x \le_{\mathbb{R}^p_+} y$. While the first one means $x_i \le y_i$ $(i = 1, \ldots, p)$, the latter one implies $x_i \le y_i$ $(i = 1, \ldots, p;\ i \ne j)$ and $x_j < y_j$ for some $j$. From now onwards we suppress the subscript $\mathbb{R}^p_+$ in the above relations. Moreover, for $x, y \in \mathbb{R}$, we continue to use $x \le y$ to denote that $x$ is less than or equal to $y$. The partial order $\le$ is to be understood in the right sense in the given context.

We are now ready to introduce two solution concepts for MOPP (13.1).
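The three componentwise relations can be written down directly for finite vectors. The following is a minimal sketch; the function names `leqq`, `leq` and `lt` are illustrative choices, and the vectors are assumed to have equal length.

```python
def leqq(x, y):
    """First relation: every component of x is at most the matching one of y."""
    return all(a <= b for a, b in zip(x, y))

def leq(x, y):
    """Second relation: leqq holds and x is strictly smaller in some component."""
    return leqq(x, y) and any(a < b for a, b in zip(x, y))

def lt(x, y):
    """Third relation: every component of x is strictly smaller."""
    return all(a < b for a, b in zip(x, y))

# (2,1) and (1,2) cannot be compared in either direction.
print(leqq((2, 1), (1, 2)), leqq((1, 2), (2, 1)))
# (1,2) and (1,3): the first two relations hold, the strict one does not.
print(leqq((1, 2), (1, 3)), leq((1, 2), (1, 3)), lt((1, 2), (1, 3)))
```

The first line of output shows why a vector minimum needs its own definition: neither of the two vectors is below the other.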


Definition 13.3.1 (Weak Efficient Solution). $x^* \in S$ is called a weak efficient solution of problem (13.1) if there does not exist $x \in S$ such that $f(x) < f(x^*)$. In other words, it says that whenever $x \in S$, $f(x) - f(x^*) \notin -(\operatorname{int} \mathbb{R}^p_+)$. In set notation form the same can be explained as $(f(S) - f(x^*)) \cap (-\operatorname{int} \mathbb{R}^p_+) = \emptyset$.







Fig. 13.1. 






Remark 13.3.1 In two dimensional spaces, we can graphically illustrate weak efficiency. The above definition suggests that $x^*$ is a weak efficient solution of (13.1) when the set $f(x^*) - \operatorname{int} \mathbb{R}^2_+$ contains no point of $f(S)$, see Fig 13.1. Observe that the weak efficient solution of a MOPP is not necessarily unique.

Example 13.3.1 Let $S = \{x = (x_1, x_2) : x_1 + x_2 \ge 2,\ (x_1 - 1)^2 + (x_2 - 1)^2 \le 4,\ x_1, x_2 \text{ are integers}\}$ and $f : S \to \mathbb{R}^2$ be defined as $f(x_1, x_2) = (x_1 + x_2^2,\ x_1^2 + x_2)$. Consider the problem $\min_{x \in S} f(x)$, and identify the set of weak efficient solutions.








Fig. 13.2.


Solution Observe that S = {(0,2), (1,1), (1,2), (1,3), (2,0), (2,1), (2,2), (3,1)}; consequently, f(S) = {(4,2), (2,2), (5,3), (10,4), (2,4), (3,5), (6,6), (4,10)}. The sets S and f(S) are shown in Fig 13.2. The set of weak efficient solutions of the problem is given by {(0,2), (1,1), (2,0)}.
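Because the feasible set here is finite, the definition of a weak efficient solution can be checked by direct enumeration. The sketch below uses the lists of S and f(S) exactly as given in the solution above; the helper name `strictly_less` is an illustrative choice.

```python
# Feasible points and their images, as listed in the solution of Example 13.3.1.
S  = [(0, 2), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (3, 1)]
fS = [(4, 2), (2, 2), (5, 3), (10, 4), (2, 4), (3, 5), (6, 6), (4, 10)]

def strictly_less(u, v):
    """True when u is smaller than v in every component."""
    return all(a < b for a, b in zip(u, v))

# x is weak efficient when no image point is strictly smaller in all criteria.
weak_efficient = [x for x, fx in zip(S, fS)
                  if not any(strictly_less(fy, fx) for fy in fS)]
print(weak_efficient)
```

The enumeration returns the three points stated in the solution.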


Example 13.3.2 Let $S = [0,1] \times [0,1]$ and $f : S \to \mathbb{R}^2$ be the identity map given by $f(x_1, x_2) = (x_1, x_2)$. Consider the problem

$\min_{x \in S} f(x)$, (13.2)

and obtain the set of weak efficient solutions.

Solution Obviously $f(S) = S$. The point $x^* = (0,0)$ is a weak efficient solution as there exists no $x \in S$ with $x_1 < 0$ and $x_2 < 0$. Similarly, $(1,0)$ is also a weak efficient solution as there exists no $x \in S$ with $x_1 < 1$ and $x_2 < 0$. In fact it can easily be seen that all the points in the set $\{(x_1, 0) : 0 \le x_1 \le 1\} \cup \{(0, x_2) : 0 \le x_2 \le 1\}$ are weak efficient solutions of (13.2).

Ey eee) va et S = R2 ahd E S> as Re be defined as fx 

ah ine ay “ solutions of Min, fO) 


i ot ee 












X2) = eo ss Obtain 


weak efficient solutions of the 


bne s the set -A 
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1x2 2 1,%1 + X2 2 4, maxļ|2y 
Ix = (x1,%2) H Xi 2 r” eed 


1, 3x \ 


se 2} > 
13.3.4 oy a = fep ta): Obtain the set of weak efficient Solution 9 
] R? be Xi; | a 
and f : S > 
Min, f(2)- 
reS 


Fig. 13.3.


Solution The set f(S) is depicted in Fig 13.3. The set of weak efficient solutions is 
marked by a bold curve. 

We now define another solution concept called efficient solution of MOPP (13.1). The idea of efficiency is based upon the conviction that no criterion can be improved without worsening at least one other criterion.

Definition 13.3.2 (Efficient Solution). $x^* \in S$ is called an efficient solution of problem (13.1) if there does not exist $x \in S$ such that $f(x) \le f(x^*)$. In other words, it says that whenever $x \in S$, $f(x) - f(x^*) \notin -(\mathbb{R}^p_+ \setminus \{0\})$. In set notation form the same can be explained as $(f(S) - f(x^*)) \cap (-(\mathbb{R}^p_+ \setminus \{0\})) = \emptyset$.

This solution is also known by the names Pareto solution or non-inferior solution.
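On a finite image set, efficiency can be tested by the same kind of enumeration as weak efficiency, with the strict comparison replaced by "no larger in any criterion and different". The sketch below reuses the data listed in the solution of Example 13.3.1; the helper name `dominates` is an illustrative choice.

```python
# Feasible points and their images, as listed in the solution of Example 13.3.1.
S  = [(0, 2), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (3, 1)]
fS = [(4, 2), (2, 2), (5, 3), (10, 4), (2, 4), (3, 5), (6, 6), (4, 10)]

def dominates(u, v):
    """u is no larger than v in every component and u differs from v."""
    return all(a <= b for a, b in zip(u, v)) and u != v

# x is efficient when no image point dominates f(x).
efficient = [x for x, fx in zip(S, fS)
             if not any(dominates(fy, fx) for fy in fS)]
print(efficient)
```

Only (1,1) survives, while the weak efficient set of the same example has three points, so the two concepts genuinely differ.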


Definition 13.3.3 (Efficient Frontier). The set of efficient solutions of MOPP (13.1) is called the efficient frontier.








Remark 13.3.2 It follows from Definition 13.3.1 and Definition 13.3.2 that every efficient solution is a weak efficient solution of problem (13.1). But the converse need not be true. Some of the following examples justify this fact.
in f), ond idensapy pa $ "2S O} and f :S R? 
Ree O efficient solutions, 


i 
y — 


be f(x1,X2) = (x4,%2): 


i z - Gai hi ae: qe $ aby 
~~ Utlons is an empty set because aah 
“Sale 


y 
ha 







o gar 


Example 13.3.1 has only one efficient solution, namely, (1,1). Also, recall Example 13.3.2. It is clear that (0,0) is the only efficient solution of the MOPP considered therein while the set of weak efficient solutions is an uncountable subset of S.


le 13.3.6 Let Saits (x1, X2) “ne 


EX . d . i [0,1], Wt <x» < A) Identify the se 
efficient an efficient solutions of Min fly. aia 1}. Identify the sets 
af weak ffi res f(X1,x2) = (x1, X2). 


Solution The set f(S) = S is depicted in the first figure in Fig 13.4. Here, both the set of weak efficient solutions and the set of efficient solutions are equal to the singleton set {0}.

Fig. 13.4.

However, we need to realize that a MOPP can have many efficient solutions. For instance, in Example 13.3.4, the efficient frontier is $\{(1-\lambda)(1,3) + \lambda(2,2) : \lambda \in [0,1]\} \cup \{(3,1)\}$. The following examples also confirm the statement.
x1 < 
Example 13.3.7 Let S = {(x1,2) : X1 = 1} Uile) : i < 0, x2 > 1} U{(x1, x2) : x1 S 
0, % <1, BIT XD) 2 0} Ui, os 0 < A aR, efficient frontier. 
Consider Min  f(x1, x2) = %1/%2) 4 
(x1, x2) E 


f the set f (S) = S is presented in the second figure 
O 


Soluti ical illustration 
olution The graphic described by 


S — x À 0, 1}}. 
ayy rcomvuia-AC-Y* ee aka 


if instead we have 
f efficient solutions 


t) 
=~ 


d set. Moreover, 


nnecte 
ot a CO x1), then the set O 


frontier is D 
Example 13.3.8 Consider the multiobjective programming problem

Min $f(x_1, x_2) = (-x_1,\ x_1 + x_2^2)$
subject to
$x_1^2 - x_2 \le 0$
$x_1 + 2x_2 \le 3$,

and identify its efficient frontier.

Solution The feasible set $S$ and its image set $f(S)$ are shown in Fig 13.5.

Fig. 13.5. (The image point $(1.5, 3.5625)$ is marked in the figure.)


The efficient frontier is given by $\{(x_1, x_2) : x_1 \in [-1/\sqrt[3]{4},\ 1],\ x_2 = x_1^2\}$.

Through the above examples it is clear that the efficient solution of MOPP (13.1) is one for which if one criterion improves then at least one other criterion deteriorates in the sense of minimization. In other words, in moving from an efficient solution to another feasible point, the value of one criterion cannot decrease without the value of at least one other criterion increasing. However, it is possible that at an efficient solution the improvement in one criterion is marginal as compared to the loss in the other criterion. To exclude such situations, Geoffrion [67] introduced a sharper notion of efficiency, called proper efficiency.

Definition 13.3.4 (Properly Efficient Solution). $x^* \in S$ is called a properly efficient solution of problem (13.1), if it is an efficient solution and there exists a real number $M > 0$ such that for every index $i$ $(i = 1, \ldots, p)$ and every $x \in S$ with $f_i(x) < f_i(x^*)$, there exists an index $j$ $(j = 1, \ldots, p)$ such that $f_j(x^*) < f_j(x)$ and

$\dfrac{f_i(x^*) - f_i(x)}{f_j(x) - f_j(x^*)} \le M.$

A properly efficient solution is sometimes referred to as a Geoffrion properly efficient solution or a properly Edgeworth-Pareto optimal solution. We shall restrict ourselves to calling it a properly efficient solution.


Remark 13.3.3 (i) It follows from the definition that every properly efficient solution is an efficient solution of problem (13.1).

(ii) An efficient solution which is not a properly efficient solution is called an improperly efficient solution.


Definition 13.3.5 (Improperly Efficient Solution). An efficient solution $x^*$ is called an improperly efficient solution of problem (13.1) if for every real number $M > 0$, there exist an index $i$ $(i = 1, \ldots, p)$ and some $x \in S$ with $f_i(x) < f_i(x^*)$ such that for every index $j$ $(j = 1, \ldots, p)$ with $f_j(x^*) < f_j(x)$, we have

$\dfrac{f_i(x^*) - f_i(x)}{f_j(x) - f_j(x^*)} > M.$

Commonsense reasoning says that improperly efficient solutions are not desired, as improvement in one criterion comes only with a large sacrifice in the other criterion.


Example 13.3.9 Consider the linear multiobjective programming problem

Min $f(x_1, x_2) = (x_1 + 2x_2,\ 2x_1 - x_2)$
subject to
$x_1 + x_2 \le 1$
$x_1, x_2 \ge 0$.

Identify the set of efficient solutions and show that $x^* = (0,1)$ is a properly efficient solution of the given MOPP.

Solution Let $y_1 = x_1 + 2x_2$, $y_2 = 2x_1 - x_2$. Then the feasible set $S$ corresponds to the set $f(S) = \{(y_1, y_2) : 3y_1 + y_2 \le 5,\ y_1 + 2y_2 \ge 0,\ 2y_1 - y_2 \ge 0\}$. It is clear that the set of efficient solutions is given by $\{(0, x_2) : x_2 \in [0,1]\}$.

Consider an efficient solution $x^* = (0,1)$. Then $f(x^*) = (2, -1)$, and for any $x \in S$, $x \ne x^*$, we have $y_1 < 2$ and $y_2 > -1$. Take $i = 1$ and $j = 2$ in the definition of properly efficient solution. It can be verified that

$\dfrac{f_1(x^*) - f_1(x)}{f_2(x) - f_2(x^*)} = \dfrac{2 - y_1}{y_2 + 1} \le M$

holds with, for instance, $M = 2$. Thus $x^* = (0,1)$ is a properly efficient solution of the MOPP.

13.4 Weighted Sum Approach


For reference recall the MOPP

$\min_{x \in S} \; f(x) = (f_1(x), \ldots, f_p(x))$, (13.3)

where $S$ is the feasible set and $f : S \to \mathbb{R}^p$, $p \ge 2$, is the objective function.

For arbitrary $\lambda_i > 0$ $(i = 1, \ldots, p)$, $\sum_{i=1}^{p} \lambda_i = 1$, associate a scalar problem with MOPP (13.3):

$\min_{x \in S} \; \sum_{i=1}^{p} \lambda_i f_i(x)$. (13.4)

The scalar $\lambda_i$ $(i = 1, \ldots, p)$ can be interpreted as a positive weight or priority assigned to the i-th objective criterion by the decision maker. The sum of all the weights is usually taken to be one to ensure that the criteria are appropriately scaled and relatively placed. We present a hypothetical situation for clarity. Suppose $p = 3$ and $\lambda_1 = \frac{1}{2}$, $\lambda_2 = \frac{1}{3}$, $\lambda_3 = \frac{1}{6}$. It means that the first objective criterion is 3 times as important as the third objective criterion and 1.5 times as important as the second objective criterion, while the second objective criterion is twice as important as the third objective criterion. Equivalently, the objectives' priorities are ranked 3:2:1 by the decision maker.
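To see the scalar problem (13.4) at work, one can apply it to the linear problem of Example 13.3.9. The sketch below is an illustration under a simplifying assumption: since the weighted objective is linear and the feasible set is a bounded polygon, it is enough to compare the three vertices (0,0), (1,0) and (0,1). The function name `weighted_sum_argmin` and the two weight vectors are illustrative choices, not taken from the text.

```python
from fractions import Fraction as Fr

def f(x):
    """Objective vector of Example 13.3.9."""
    x1, x2 = x
    return (x1 + 2 * x2, 2 * x1 - x2)

# Vertices of S = {x1 + x2 <= 1, x1 >= 0, x2 >= 0}.
vertices = [(0, 0), (1, 0), (0, 1)]

def weighted_sum_argmin(lam):
    """Vertices minimizing lam[0]*f1(x) + lam[1]*f2(x)."""
    def score(x):
        return sum(l * fi for l, fi in zip(lam, f(x)))
    best = min(score(v) for v in vertices)
    return [v for v in vertices if score(v) == best]

print(weighted_sum_argmin((Fr(1, 2), Fr(1, 2))))   # equal weights
print(weighted_sum_argmin((Fr(1, 5), Fr(4, 5))))   # heavier weight on f2
```

With equal weights the minimizer is (0,0); with the heavier weight on the second objective it is (0,1). Both points lie in the efficient set {(0, x2) : x2 in [0,1]} found in Example 13.3.9.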


From now onwards we take the set

$\Lambda = \{\lambda \in \mathbb{R}^p : \lambda_i > 0\ (i = 1, \ldots, p),\ \sum_{i=1}^{p} \lambda_i = 1\}$.

Theorem 13.4.1 Let $x^* \in S$ be an optimal solution of problem (13.4), for fixed $\lambda \in \Lambda$. Then $x^*$ is a properly efficient solution of problem (13.3).

Proof. The proof is achieved in two parts. First we show that $x^*$ is an efficient solution of problem (13.3), and later, proper efficiency for problem (13.3) is worked out. Both parts are proved by contradiction.

Suppose $x^*$ is not an efficient solution of problem (13.3). Then there exist $x \in S$ and an index $i$, $1 \le i \le p$, such that

$f_j(x) \le f_j(x^*)$ $(j = 1, \ldots, p;\ j \ne i)$, and $f_i(x) < f_i(x^*)$. (13.5)

As $\lambda \in \Lambda$, it follows from (13.5) that

$\sum_{i=1}^{p} \lambda_i f_i(x) < \sum_{i=1}^{p} \lambda_i f_i(x^*)$,

a contradiction to the optimality of $x^*$ for problem (13.4).

Next, assume that $x^*$ is an improperly efficient solution of problem (13.3). Choose $M = (p-1) \max_{i,j} \frac{\lambda_j}{\lambda_i}$. Then, for some index $i$ $(i = 1, \ldots, p)$ and some $x \in S$ with $f_i(x) < f_i(x^*)$, we obtain

$\dfrac{f_i(x^*) - f_i(x)}{f_j(x) - f_j(x^*)} > M$ for all $j$ $(j = 1, \ldots, p)$ with $f_j(x) > f_j(x^*)$.

This implies

$f_i(x^*) - f_i(x) > M (f_j(x) - f_j(x^*)) \ge (p-1) \dfrac{\lambda_j}{\lambda_i} (f_j(x) - f_j(x^*))$ $(j = 1, \ldots, p;\ j \ne i)$,

where, for the indices $j$ with $f_j(x) \le f_j(x^*)$, the outer inequality holds trivially because its left side is positive and its right side is nonpositive. Multiplying throughout by $\frac{\lambda_i}{p-1}$ and summing over all $j$ $(j = 1, \ldots, p;\ j \ne i)$ yields

$\lambda_i (f_i(x^*) - f_i(x)) > \sum_{j=1, j \ne i}^{p} \lambda_j (f_j(x) - f_j(x^*))$.

Consequently,

$\sum_{j=1}^{p} \lambda_j f_j(x^*) > \sum_{j=1}^{p} \lambda_j f_j(x)$,

again contradicting optimality of $x^*$ for problem (13.4). The conclusion is now evident from the two contradictions.


Theorem 13.4.2 Let $S$ be a convex set and each $f_i$ $(i = 1, \ldots, p)$ be a convex function on $S$. If $x^* \in S$ is a properly efficient solution of problem (13.3), then there exists $\lambda \in \Lambda$ such that $x^*$ is an optimal solution of problem (13.4).

We skip the proof here as it requires additional knowledge about convex sets. However, we recommend the readers to refer to the article of Geoffrion [67] for the detailed proof.
“xample 13.4.1 Gon). 

ES 3 ji teria linear MOPP 

el (—2x, + X2, x} — 2x2) 







= 












wee. E PES â si 
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Example 13.4.2 Consider 


f(xy, %2) i 





Min : 
rect to m 
subjec -3t 29 y 
04x143 PN 

< xX) $2. 
0 < %2 FA 


Construct the weighted sum scalar problem, and analyze the same in the light of Theorem 13.4.1 and Theorem 13.4.2.
Solution The feasible set S and the 
The efficient frontier of the problem 
it is described by 

a a): x = %2,05%1S MOO) 2 S a < 3). 


corresponding image set f(S) are shown in Fig 134 
is marked in bold in the first figure in Fig 13.8 a 


fo 






Fig. 13.8.

(i) The weighted sum approach reduces MOPP (13.3) to a scalar nonlinear programming problem (13.4) whose objective is the weighted sum of the objective functions $f_i$. The vector $\lambda \in \Lambda$ in problem (13.4) is appropriately referred to as a weight vector.

(ii) In view of Theorem 13.4.2, the weighted sum approach seems to be suitable for convex MOPP (13.3). However, for a general nonconvex MOPP, it is possible that not every properly efficient solution can be determined using this approach. Despite this, the approach works well for a large class of generalized convex MOPPs.

(iii) The above two theorems characterize properly efficient solutions of MOPP (13.3). Similar characterizations can be derived for weak efficient solutions and efficient solutions of MOPP (13.3). The main principle behind these characterizations is the same. It relates various solutions of MOPP with optimal solutions of a suitable scalar nonlinear programming problem. The difference emerges in the choice of the weight vector $\lambda \in \mathbb{R}^p$. We summarize the results in the following two theorems.


Theorem 13.4.3 Let $S$ be a convex set and each $f_i$ $(i = 1, \ldots, p)$ be a convex function on $S$. Then $x^* \in S$ is a weak efficient solution of problem (13.3) if and only if there exist $\lambda_i \ge 0$ $(i = 1, \ldots, p)$, $\sum_{i=1}^{p} \lambda_i = 1$, such that $x^*$ is an optimal solution of the scalar problem

$\min_{x \in S} \; \sum_{i=1}^{p} \lambda_i f_i(x)$. (13.12)

Theorem 13.4.4 Let $x^* \in S$ be an optimal solution of problem (13.4), for fixed $\lambda \in \Lambda$. Then $x^*$ is an efficient solution of problem (13.3). Conversely, let $S$ be a convex set and each $f_i$ $(i = 1, \ldots, p)$ be a convex function on $S$. If $x^* \in S$ is an efficient solution of problem (13.3) then there exist $\lambda_i \ge 0$ $(i = 1, \ldots, p)$, $\sum_{i=1}^{p} \lambda_i = 1$, such that $x^*$ is an optimal solution of problem (13.12).


| efficient solution of problem (13.10), and if we 
1S an optimal solution of the weighted scalar problem 








Ing it convenient to handle the problem. The 


H st wide 
Miculties in choosing the ee! to solve MOPP. Of course there are so° 
erg “Ate UIC on x £ at . x 
€ the decision maker pih aa On of weights. The weights reflect ho” 
Da aE WISHES to Tive tn j “ea iteri in 
v MALS Precise nymerij, j 0 
_, . numerical values of the weights: * 
_~ ~JWCtIVe Is to decrease the manufactur” 6 


ze =- | ; AA v , sa vA NZ. ALLAS 
a Y “ee =~ ing’ Piz i 
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red AEA. 
conver set 
t solution 
t x* 4s ae 


and if we 
. problem 


larization 












b goals rather than exact achievement of the 
i “scussion we pause here to de 


O Åq 


TEPEN 
ulti Objective Optimizati 
on: 


side more basic facilities to the weake 

e time wishes to decrease the subsi 
the ibilities and aids 
jal responsibilities and economic deve 
weight vector. There is no m 


and the set up of the problem. Furthermore, different weight vectors need not necessarily generate different properly efficient solutions of MOPP; in Example 13.4.1, two different weight vectors give the same optimal solution. This complicates the search for the properly efficient solutions of MOPP by the weighted sum approach. The root cause behind this is that the mapping depicting the relationship between the weight vectors and the properly efficient solutions of MOPP is usually not known.

The practical limitations of the weighted sum approach motivate us to look for another approach which inherits the basic ideas of MOPP and at the same time is computationally easy to implement. The search leads us to 'goal programming'. We shall elaborate on this concept in the forthcoming sections.

13.5 Formulation of Goal Programming Problem 


Rather than asking for the numerical value of the weight for each objective criterion in MOPP (13.1), the decision maker is asked to rank the objectives according to their perceived importance. He is also asked to set up the aspired target values for each objective criterion. For instance, suppose $5x_1 + 2x_2 + 4x_3$ represents the profit function and $10x_1 + 3x_2 + 5x_3$ represents the maintenance cost of the finished goods inventory of some organization. An interactive discussion with the decision maker in the organization reveals that his first priority is to maximize profit and then to minimize the maintenance cost. Further, he also aspires to obtain a profit of Rs. 5000 and to curtail the maintenance cost up to Rs. 1000. In this way, a dialogue with the decision maker reveals additional information on the aspired values for each objective criterion that can be used in the formulation of the problem.

The method of goal programming (GP) is based on this idea. It actually consists of formulating an optimization problem in such a manner that ensures that the objective criteria come close to the specified aspiration levels in the order of priorities set up by the decision maker. It is worth noting that goal programming aims at satisfaction of the goals rather than their exact achievement. Before proceeding further with our discussion we pause here to define related terminologies.


Definition 13.5.1 (Aspiration Level). Aspiration level is the numerical value specified by the decision maker that reflects his desire or satisfactory level with regard to the objective function under consideration.

Definition 13.5.2 (Goal). An objective function along with its aspiration level is termed as a goal.

Definition 13.5.3 (Goal Deviation). The difference between what we actually achieve and what we desire to achieve is called goal deviation. If the goal deviation is positive, it indicates overachievement of the goal as the actual achieved value is more than the aspired value. On the other hand, a negative value of the goal deviation indicates underachievement of the goal as the aspired level is more than what we could actually manage to achieve.


For instance, suppose the profit of an organization is described by a function $5x_1 + 2x_2 + 4x_3$. Now the decision maker wishes to attain a profit of at least Rs. 5000. So, the aspiration level of the decision maker is 5000 with regard to the objective function and the goal is described by the inequality $5x_1 + 2x_2 + 4x_3 \ge 5000$. If for some feasible vector $x^* = (x_1^*, x_2^*, x_3^*)$, $5x_1^* + 2x_2^* + 4x_3^* > 5000$, then it shows that by taking the decision $x^*$ the decision maker can achieve more profit than what he aspired. Whereas if for all feasible $x = (x_1, x_2, x_3)$, $5x_1 + 2x_2 + 4x_3 < 5000$, then no decision by the decision maker can help him to attain the aspired goal. This situation represents underachievement of the goal.

One obvious question arises here. How does the decision maker arrive at a figure of Rs. 5000? Arguably, setting up too high or too low a value for the aspiration level is not advisable. First of all we assume that the decision maker is rational and knowledgeable. Secondly, in a MOPP (13.1) with $p$ objective criteria, one can solve $p$ individual scalar programming problems

$\min_{x \in S} \; f_i(x)$ $(i = 1, \ldots, p)$. (13.13)

Each of these problems, being a single objective nonlinear constrained programming problem, can be solved by the techniques described in earlier chapters. Once the optimal values, say $f_i(x^*)$ $(i = 1, \ldots, p)$, are known, they can be used as aspiration levels for the respective objective functions.


We now turn our attention to formulating a mathematical model of the goal programming problem.

Suppose the i-th objective function is described by $f_i(x)$, $x \in \mathbb{R}^n$, and its aspiration level is specified by $v_i \in \mathbb{R}$. The possible form of the i-th goal is

either $f_i(x) \le v_i$,
or $f_i(x) \ge v_i$,
or $f_i(x) = v_i$.

Introducing two additional variables $d_i^- \ge 0$, $d_i^+ \ge 0$, we can transform the above goals into the equation

$f_i(x) + d_i^- - d_i^+ = v_i$. (13.14)

It is important to note here that if the goal is of the form $f_i(x) \le v_i$ then $d_i^+$ is the undesirable variable in equation (13.14). Because if $d_i^- = 0$ and $d_i^+ > 0$ then equation (13.14) gives $f_i(x) = v_i + d_i^+ > v_i$, thereby not satisfying the set up goal $f_i(x) \le v_i$. While if $d_i^- > 0$ and $d_i^+ = 0$, then we get $f_i(x) = v_i - d_i^- < v_i$, consequently achieving the set goal. Thus, $f_i(x) \le v_i$ implies that $d_i^+$ is an undesirable variable and we need to minimize it. Similarly, $f_i(x) \ge v_i$ gives $d_i^-$ as an undesirable variable, and $f_i(x) = v_i$ yields $d_i^- + d_i^+$ as an undesirable expression. In all the situations we first identify the undesirable variable or expression in the goal and then attempt to minimize the same. The variable $d_i^-$ is called the negative deviational variable while $d_i^+$ is called the positive deviational variable, for obvious reasons.

For reference recall the MOPP

$\min_{x \in S} \; f(x) = (f_1(x), \ldots, f_p(x))$. (13.15)
Suppose the feasible set $S$ is described by the $m$ inequality constraints $g_j(x) \le 0$ $(j = 1, \ldots, m)$. Some of the constraints could be of $\ge$ or $=$ type also.

Since the form of the goal function and the constraint function are alike, both being in the form of mathematical inequalities or equations, goal programming treats the constraint functions also as goals. Thereby the constraints too are converted into equations using negative and positive deviational variables. In other words, the constraint $g_j(x) \le 0$ is expressed as $g_j(x) + d_j^- - d_j^+ = 0$, $d_j^-, d_j^+ \ge 0$.
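The role of the two deviational variables can be illustrated numerically. The sketch below is only one possible convention: for a given achieved value and target it returns the pair in which at most one of the two variables is nonzero, which is the pair one expects when deviations are being minimized. The function name `deviations`, the goal $4x_1 + 5x_2 \le 20$ and the two test points are illustrative assumptions.

```python
def deviations(value, target):
    """Return (d_minus, d_plus), both nonnegative, such that
    value + d_minus - d_plus == target, with at most one of them nonzero."""
    d_minus = max(0, target - value)   # shortfall below the target
    d_plus = max(0, value - target)    # excess above the target
    return d_minus, d_plus

# Hypothetical goal 4*x1 + 5*x2 <= 20 evaluated at two hypothetical points.
for x1, x2 in [(1, 2), (3, 2)]:
    value = 4 * x1 + 5 * x2
    d_minus, d_plus = deviations(value, 20)
    print((x1, x2), value, d_minus, d_plus)
```

At the first point the goal is met and only the negative deviation is nonzero; at the second point the goal is violated and the undesirable positive deviation equals the amount of violation.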
To solve an optimization problem, we must ensure feasibility of the problem. This is accomplished by assigning first priority to the goals representing the actual constraint functions. Once we attain this priority, we have a feasible solution of the problem in hand. The actual constraints are therefore termed as rigid goals or hard goals whereas the goals representing the original objective functions are called soft goals.

Summarizing the above discussion we conclude the following procedure for the formulation of the goal programming problem (GPP).
1. Specify the aspiration level for each objective function.
2. Set up the goals and convert them into equations by using negative and positive deviational variables.
3. Treat the constraints present in the problem as goals and convert them into equations by using negative and positive deviational variables.
4. Assign first priority to the hard constraints and rank all other goals according to their importance specified by the decision maker.
5. Identify the appropriate undesirable deviational variables or expressions involving deviational variables and make an attempt to minimize them in order of the specified priorities.

The task in point 5 is accomplished by constructing a multiobjective programming problem and by defining an appropriate sense of minimization for it. The following example illustrates the above procedure.

Example 13.5.1 Formulate the following MOPP as a GPP:

Min $(2x_1 - x_2,\ 4x_1 - 5x_2,\ -x_1)$
subject to
$4x_1 + 5x_2 \le 20$
$3x_1 + 2x_2 \le 12$
$x_1, x_2 \ge 0$.

Solution Suppose the decision maker gives first priority to the second objective criterion, second priority to the third objective criterion and third priority to the first objective criterion. Further, he wishes to keep the first, second and third priority objectives below 8, -2 and 1, respectively. So, the goals in order of their importance are given by

$4x_1 - 5x_2 \le 8$
$-x_1 \le -2$, that is, $x_1 \ge 2$
$2x_1 - x_2 \le 1$.

Introducing the deviational variables both in the actual goals and the two constraints (hard goals), we get the following system of linear equations

$4x_1 + 5x_2 + d_1^- - d_1^+ = 20$
$3x_1 + 2x_2 + d_2^- - d_2^+ = 12$
$4x_1 - 5x_2 + d_3^- - d_3^+ = 8$ (13.16)
$x_1 + d_4^- - d_4^+ = 2$
$2x_1 - x_2 + d_5^- - d_5^+ = 1$
$x_j, d_i^-, d_i^+ \ge 0$ $(j = 1, 2)$ $(i = 1, \ldots, 5)$.

The undesirable variables are $d_1^+, d_2^+, d_3^+, d_4^-, d_5^+$, respectively.

Our next task is to construct an objective function depicting the priorities of the decision maker. The first rank is reserved for the hard constraints, followed by the second objective function, the third objective function and then the first objective function, respectively. Therefore, we make an attempt to minimize $(d_1^+ + d_2^+,\ d_3^+,\ d_4^-,\ d_5^+)$ in this order only. Thus the GPP becomes

Min $(d_1^+ + d_2^+,\ d_3^+,\ d_4^-,\ d_5^+)$
subject to (13.16).
However, we have yet to specify the meaning of 'minimization in order'. The above problem is a MOPP in which the objective criteria, in terms of the deviational variables, are ranked according to their importance and the minimization process has to follow this ranking. We simply can not bypass this order, whereas in the concepts of weak efficiency, efficiency or proper efficiency the order of the objective criteria is immaterial.

Definition 13.5.4 (Lexicographic Minimum Vector). A vector $w^{(1)} \in \mathbb{R}^p$ is said to be preferred to another vector $w^{(2)} \in \mathbb{R}^p$ if $w_i^{(1)} = w_i^{(2)}$ $(i = 1, \ldots, k-1)$ and $w_k^{(1)} < w_k^{(2)}$ for some $k$ $(k = 1, \ldots, p)$. For example, if $w^{(1)} = (0, 14, 2, \ldots, 25)$ and $w^{(2)} = (0, 12, 5, 11, 27)$, then $w^{(2)}$ is preferred to $w^{(1)}$. If, in some set $B \subset \mathbb{R}^p$, there is a $w^* \in B$ which is preferred to every other vector in $B$, then $w^*$ is called the lexicographic minimum of the set $B$.
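For a finite set of vectors, the preference relation and the lexicographic minimum are easy to compute. The sketch below writes out the definition in `preferred` and then relies on the fact that Python compares tuples lexicographically, so the built-in `min` returns the lexicographic minimum. The set `B` contains made-up vectors used only for illustration.

```python
# Hypothetical achievement vectors (illustrative values only).
B = [(0, 14, 2), (0, 12, 5), (1, 0, 0), (0, 12, 7)]

def preferred(w1, w2):
    """True when w1 is preferred to w2 in the lexicographic sense."""
    for a, b in zip(w1, w2):
        if a != b:
            return a < b       # the first differing component decides
    return False               # identical vectors: neither is preferred

lexi_min = min(B)              # tuples are compared lexicographically
print(lexi_min)
print(all(preferred(lexi_min, w) for w in B if w != lexi_min))
```

The vector (1, 0, 0) loses despite its small later components, because its first component is already larger; this is the dictionary-like behaviour the definition describes.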


It is worth noting here that in choosing the lexicographic minimum among the candidate vectors, we search for a vector with minimum first component. If the choice is unique, we stop; else we proceed to choose the minimum second component without altering the minimum first component. We repeat this procedure till we get a unique vector with first $k$ minimum components; $k$ can be equal to $p$ also. The procedure is somewhat similar to searching for an English word in a dictionary. The minimization in GPP is taken in the sense of lexicographic minimum (Lexi-Min). Thus a general model of GPP is described as follows

Lexi-Min $(F_1(d^-, d^+), \ldots, F_K(d^-, d^+))$
subject to
$f_i(x) + d_i^- - d_i^+ = v_i$ $(i = 1, \ldots, p)$
$g_j(x) + d_{p+j}^- - d_{p+j}^+ = 0$ $(j = 1, \ldots, m)$ (13.17)
$d_i^-, d_i^+ \ge 0$ $(i = 1, \ldots, p+m)$,

where $K$ is the number of priorities specified by the decision maker. Remember that the first priority is reserved for the hard goals. Also, $F_k(d^-, d^+)$ $(k = 1, \ldots, K)$ are linear functions of the deviational variables.

If all the functions $f_i$ and $g_j$ are linear functions of the decision variable $x$ then the GPP (13.17) is called a linear goal programming problem (LGPP). In the next section we discuss the solution methodologies to solve LGPPs.


13.6 Solution Methodologies for Linear Goal Programming Problems 


We briefly present two traditionally known techniques to solve LGPPs. If the decision variable $x$ involved in LGPP belongs to $\mathbb{R}^2$ then the problem can be solved by a graphical technique. We illustrate this technique on Example 13.5.1. For convenience, we reproduce the example.

Example 13.6.1 Solve

Lexi-Min $(d_1^+ + d_2^+,\ d_3^+,\ d_4^-,\ d_5^+)$
subject to
$G_1$: $4x_1 + 5x_2 + d_1^- - d_1^+ = 20$
$G_2$: $3x_1 + 2x_2 + d_2^- - d_2^+ = 12$
$G_3$: $4x_1 - 5x_2 + d_3^- - d_3^+ = 8$
$G_4$: $x_1 + d_4^- - d_4^+ = 2$
$G_5$: $2x_1 - x_2 + d_5^- - d_5^+ = 1$
$x_j, d_i^-, d_i^+ \ge 0$ $(j = 1, 2)$ $(i = 1, \ldots, 5)$. (13.18)

Solution Ignoring the deviational variables, the five goals are plotted as straight lines in Fig 13.9.

Fig. 13.9.

Since the final priority goal $G_5$ is outside the current feasible region, we move the objective line $G_5$ parallel to itself in the upward direction to meet the feasible region. The lexicographic optimal solution is $x^* = (2, 2.4)$ and the achievement value of the objective function is $(0, 0, 0, 0.6)$. The positive value 0.6 in the end indicates that the aspired value 'below 1' of the objective $2x_1 - x_2$ can not be achieved; we can not bring down its value below 1.6. But since this objective was of least priority for the decision maker, the compromise is reasonable.
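The graphical answer can be cross-checked by brute force. The sketch below is not the graphical technique itself: it evaluates the vector of undesirable deviations on a grid of exact rational points and takes the lexicographic minimum. The grid range $[0,5] \times [0,5]$ and the step 1/10 are arbitrary assumptions, and the fourth goal is read here as $x_1 \ge 2$ (equivalently $-x_1 \le -2$), which is an assumption about the formulation of Example 13.5.1.

```python
from fractions import Fraction as Fr

def achievement(x1, x2):
    """Undesirable deviations of problem (13.18), ordered by priority."""
    def over(value, target):    # d+ of a goal 'value <= target'
        return max(Fr(0), value - target)
    def under(value, target):   # d- of a goal 'value >= target'
        return max(Fr(0), target - value)
    return (
        over(4 * x1 + 5 * x2, 20) + over(3 * x1 + 2 * x2, 12),  # hard goals G1, G2
        over(4 * x1 - 5 * x2, 8),                                # G3
        under(x1, 2),                                            # G4, read as x1 >= 2
        over(2 * x1 - x2, 1),                                    # G5
    )

# Exact rational grid with step 1/10 on [0, 5] x [0, 5] (arbitrary choice).
grid = [Fr(k, 10) for k in range(51)]
value, point = min((achievement(a, b), (a, b)) for a in grid for b in grid)

print([float(v) for v in value], tuple(float(c) for c in point))
```

On this grid the lexicographic minimum is attained at the point (2, 2.4) with achievement vector (0, 0, 0, 0.6), in agreement with the solution described above. A grid search of this kind only confirms the answer because the optimum happens to lie on the grid; it is not a substitute for the graphical or simplex-based methods.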
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"aar programs, one by one, wh 
The scheme involves solving $K$ linear programs, one by one, where $K$ is the number of priorities, in such a manner that the lexicographic order is maintained. At the same time the objective functions which have already attained their optimal values should not deteriorate in their values in the subsequent linear programs. This is ensured by adding additional constraints of the form $F_k(d^-, d^+) = a_k^*$, where $F_k$ are those objective functions which have already achieved their optimal values $a_k^*$, in the subsequent linear programs. This may cause a substantial increase in the number of constraints, particularly in large size GPPs. Luckily we can reduce the computational effort by progressively dropping some variables according to the following rule.

Column Drop Rule. Any nonbasic variable that has a negative opportunity cost $z_j - c_j$ in the optimal table of a LPP can be assigned zero value in the subsequent linear programming problems, and therefore the column corresponding to this variable can be dropped from the subsequent linear programming problems.

Implicitly the rule states that if a nonbasic variable with negative opportunity cost $z_j - c_j$ is introduced in the basis at the later stages of the algorithm it will degrade the solution in the Lexi-Min order.

The algorithm can easily be understood through the following example.
Example 13.6.2 Solve the GPP (13.18) by the column drop rule.
Solution To begin with, we solve the following LPP 

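$$
\begin{aligned}
\text{Min} \quad & d_1^+ + d_2^+ \\
\text{subject to} \quad
& 4x_1 + 5x_2 + d_1^- - d_1^+ = 20 \\
& 3x_1 + 2x_2 + d_2^- - d_2^+ = 12 \\
& x_j,\; d_i^-,\; d_i^+ \ge 0 \quad (j = 1, 2),\ (i = 1, 2).
\end{aligned}
$$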








The optimal table is given by

[Optimal table not recoverable from the scan.]

The nonbasic variables $d_1^+$ and $d_2^+$ have negative opportunity cost, hence according to the column drop rule they can be dropped from the subsequent iterations. The optimal value of the LPP is zero.

The second level LPP and its optimal table are respectively given by

$$
\begin{aligned}
\text{Min} \quad & d_3^+ \\
\text{subject to} \quad
& 4x_1 + 5x_2 + d_1^- = 20 \\
& 3x_1 + 2x_2 + d_2^- = 12 \\
& 4x_1 - 5x_2 + d_3^- - d_3^+ = 8 \\
& x_j,\; d_i^-,\; d_i^+ \ge 0 \quad (j = 1, 2),\ (i = 1, 2, 3).
\end{aligned}
$$

[Optimal table not recoverable from the scan.]





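The optimal value is zero. We then move on to construct the LPP corresponding to the third priority objective function.

$$
\begin{aligned}
\text{Min} \quad & d_4^- \\
\text{subject to} \quad
& 4x_1 + 5x_2 + d_1^- = 20 \\
& 3x_1 + 2x_2 + d_2^- = 12 \\
& 4x_1 - 5x_2 + d_3^- = 8 \\
& x_1 + d_4^- - d_4^+ = 2 \\
& x_j,\; d_i^-,\; d_i^+ \ge 0 \quad (j = 1, 2),\ (i = 1, \ldots, 4).
\end{aligned}
$$

The optimal table is shown below.

[Optimal table not recoverable from the scan.]

The final priority criterion is represented by the following LPP, and its optimal solution table is subsequently given.

$$
\begin{aligned}
\text{Min} \quad & d_5^- \\
\text{subject to} \quad
& 4x_1 + 5x_2 + d_1^- = 20 \\
& 3x_1 + 2x_2 + d_2^- = 12 \\
& 4x_1 - 5x_2 + d_3^- = 8 \\
& x_1 - d_4^+ = 2 \\
& -x_1 + x_2 + d_5^- - d_5^+ = 1 \\
& x_j,\; d_i^-,\; d_i^+ \ge 0 \quad (j = 1, 2),\ (i = 1, \ldots, 5).
\end{aligned}
$$

[Optimal table not recoverable from the scan; the surviving entries of the $z_j - c_j$ row are $-1/5$ and $-14/5$.]
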
The optimal value of the LPP is 3/5. Thus the lexicographic optimal value of the 
GPP is (0, 0, 0, 0.6). 
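
The lexicographic value reported above can be checked by evaluating the deviational variables directly at a candidate point. The following Python sketch does this under two assumptions that come from our reading of the scan and are not guaranteed: the five goal rows of (13.18) are as listed in `GOALS`, and the priority vector is $(d_1^+ + d_2^+,\ d_3^+,\ d_4^-,\ d_5^-)$. The names `GOALS`, `deviations` and `achievement` are ours.

```python
# Goal rows of GPP (13.18) as read from the scan (assumed):
# each entry is ((coefficient of x1, coefficient of x2), target value).
GOALS = [
    ((4, 5), 20),    # G1
    ((3, 2), 12),    # G2
    ((4, -5), 8),    # G3
    ((1, 0), 2),     # G4
    ((-1, 1), 1),    # G5
]


def deviations(x):
    """Return one (d_minus, d_plus) pair per goal at the point x."""
    pairs = []
    for coeffs, target in GOALS:
        value = sum(c * xi for c, xi in zip(coeffs, x))
        gap = target - value  # positive gap: the goal is under-achieved
        d_minus = gap if gap > 0 else 0.0
        d_plus = -gap if gap < 0 else 0.0
        pairs.append((d_minus, d_plus))
    return pairs


def achievement(x):
    """Priority vector (d1+ + d2+, d3+, d4-, d5-); this structure is assumed."""
    d = deviations(x)
    return (d[0][1] + d[1][1], d[2][1], d[3][0], d[4][0])


if __name__ == "__main__":
    print([round(v, 6) for v in achievement((2, 2.4))])
```

At the point $(2, 2.4)$ the first three components vanish and only the under-achievement of the last goal remains, which agrees with the value $(0, 0, 0, 0.6)$ stated in the text.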


The optimal solution of the three objective LPP in the Example 13.5.1 is $x = (2, 2.4)$ and the objective value in order of importance is $(-0.4, -2, 6)$.


13.7 Summary and Additional Notes 


• Section 13.2 provides glimpses of multiobjective programming by briefly discussing the basic issue of trade-off between the objective criteria.

• Section 13.3 describes various solution concepts related to the multiobjective programming problems. The concepts are well illustrated through many examples. For more reading refer to the texts by Jahn [82] and Sawaragi et al. [140].
• The most commonly used technique to solve a MOPP is to use scalarization and convert the MOPP into a scalar programming problem. This approach is explained in the chapter.

• Numerous research articles are devoted to the study of KKT type optimality conditions and duality results for nonlinear MOPPs. Among the works that had been novel in their approach are those of P. Wolfe, Mond [163], Chankong and Haimes [38], Charnes and Cooper, Jeyakumar and Mond [85], Singh [144] and Hanson, to name a few. While framing the above references we have restricted ourselves; one can obviously go deeper to find many interesting results for MOPPs in other versions.


• Evolutionary algorithms have been successfully used to solve MOPPs. The primary reason for their success is the ability of the evolutionary algorithms to generate multiple Pareto optimal solutions in a single simulation run. A brief description of a few such algorithms is given later in this book. The beginners in the field of multiobjective optimization with evolutionary algorithms can refer to a good text by Deb [46].

• Intense research in the last many decades has resulted in various techniques, like multiattribute utility analysis, normal boundary intersection method, multiobjective heuristic algorithms, among others, for computing efficient frontiers of extremely complex engineering design problems and decision making problems. However, none of these techniques is perfect and selecting among them depends on the requirements of the problem. For more details on the solution methodologies, we suggest a website dedicated to multiobjective programming: http://www.lania.mx/~ccoello/EMOO.

• In this chapter we have focussed on a MOPP in finite dimensional real space. However, vector optimization problems set up in very abstract spaces have been investigated in detail in the last many years. Excellent texts are available to get an insight into this otherwise extremely technical subject. For instance, one can refer to Borwein [25], Jahn [82], Luc [105], Luenberger [100], Steuer [148] and Sawaragi et al. [140]. Furthermore, numerous research articles can be found dealing with nonsmooth nonlinear MOPPs.
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13.8 Exercises 


13.1 Consider the eal f(x) = (%1 + 9% x4 | 


subject Lo 


e2 

xı — 2%2 
Ox, + X2 <9 
X1, %2 2 0. 
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; take 
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J ven easible solution 18 € 


ther each 
į) Determine graphically whet ya 
ition of the problem: (2, 0), (4,7), A a BE tec values space K d 
(iit) Identify the efficient frontier of this | = 
: ee | 
A the above exercise for the linear MO 
13.2 Do the a seco why 


Min 
subject to 
—5x1 + 2x2 S 10 
Ltn 23 
xı +2% 2] 4 
Xir XD Z 0, 


and points (4,0), (2,1), (3,3), (1,2), (5,0), (0,0). 
13.3 Let S = {(x1, X2, x3) € RÌ : z g 05) Define a function f : S > R? by 
Ff (X11, x2, x3) = Ge — Xx], Xo RA Z). 


Is $(0, 0, 0)$ an efficient solution of the MOPP $\min_{x \in S} f(x)$? Give reasons for the answer.


13.4 Let S = [0,2]. Define f : S + R? by 


fla) ={ O@-—11—x) “x elo1 
(Ca), xE Kar 


Find all the weak efficient solutions and efficient solutions of the two objective programming problem $\min_{x \in S} f(x)$.


(i) Maz (x1 + 2x5, ~2oei Alpe 






i 
nA, ai 
f Diis 2 
jE: = 
i — = 


` 4 e ae C = 
* ig i = 
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(itt) Max ( 4x, 


subject to 





*1 x60 

x2 <40 

“1,X%2 >, 

13.6 Using the weighted sum approach, find the efficient frontier of the following multiobjective nonlinear programming problems.
‘ (i) Min, (2x1X2, X? 4 x2); 
xER Ah RE 
7 2 
(ii) Min Es ties 3 — x4 + x3} 


subject to —3<x,, Xo <6) 


13.7 Consider the following MOPP
an (Ar + x2, x5 +92} 
2 
subject to 
xe + % <D) 
FIG ate X5 > I 
Xa 
Xi) S 


Find the efficient frontier of the problem. 


13.8 Consider the following MOPP

Maz (x1, —x1 - x5} 

subject to 

1G —x, <0 
X,+2x2 < Or 
Determine all the efficient solutions of the problem (A ck 1 Ee 
(5) and (3, 4). Is any of the generated efficient solute 
of the problem? Justify your answer. 
13.9 Determine x = (x1, X2) $0 as to 
(1) Lexi - Min 


subject to 


(dt + d3), (43), (di), (45)) 










> = 80 
ay, + 5x2 + dy =f; ; 


Ax, + 2x2 + d3 i: 

gx, + 100x2 + 43 -f3 
gam 

x, + X2 t a 5 

xX,U s 


— 
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a  Lei-Min Idi) (d5), (d3), (di + 43)} 
subject to 
2x, +x2 +d; -di = 20 
ETF d5 - d5 = 12 
X2 + dz _ d3 = 10 
moe ,o 20, 
(iti) Levi - Min {(d; + dy), (2d; + d3)) 
subject to 
xı — 10x2 +d; - d = 50 
3x1 +5x2 +d; - d; =20 
8xı +6x2 +d; -d3 = 100 
ene > 0. 
(iv) Leri - Min (a; to, A) } 
subject to 





¥1—xX2+d,-d; =10 


2x1 tOta — dz = 26 


2 
—X] +2x2 +d; — d} =6 
a >o. 
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14 
Semi-definite Programming


14.1 Introduction

In this chapter, we focus on a special class of nonlinear programming problems called semi-definite programming problems (SDPPs). This class of problems can be viewed as an extension of the class of LPPs wherein the data of the problem are symmetric matrices. The constraints in such optimization problems involve comparison of elements in the set of symmetric matrices. For this purpose, the set of symmetric matrices is equipped with the positive semi-definite partial order relation. The resultant partially ordered set of symmetric positive semi-definite matrices possesses many promising properties that eventually lead to efficient solution procedures for the class of semi-definite programming problems.


14.2 Motivation

Before going into the technicalities of the main problem we lay the foundation of the subject and get familiar with it. While passing from the linear programming problems to the nonlinear programming problems, it was very natural for us to allow one or more than one function appearing in the linear programming problem (LPP) to be a nonlinear function of the decision variable $x$. This kind of nonlinear programming problem model is described by

$$
\begin{aligned}
\text{Min} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \ge 0 \quad (i = 1, \ldots, m),
\end{aligned}
\qquad (14.1)
$$

and it has been a subject of study in previous chapters. There is another way to introduce nonlinearity in the LPP model. Although it may not occur very naturally to us, instead of treating the objective function and the constraint functions as nonlinear one can treat the order relation '$\ge$' as nonlinear.

Consider the nonlinear optimization problem

$$
\begin{aligned}
\text{Min} \quad & x_1 \\
\text{subject to} \quad & 4x_1x_2 \ge 1 \\
& x_1 + x_2 > 0.
\end{aligned}
$$

The problem can be equivalently expressed as

$$
\begin{aligned}
\text{Min} \quad & x_1 \\
\text{subject to} \quad & \sqrt{(x_1 - x_2)^2 + 1} \le x_1 + x_2,
\end{aligned}
$$

i.e.

$$
\begin{aligned}
\text{Min} \quad & x_1 \\
\text{subject to} \quad & (x_1 - x_2,\ 1,\ x_1 + x_2) \in S,
\end{aligned}
$$

where $S = \{(v_1, v_2, v_3) \in \mathbb{R}^3 : v_3 \ge \sqrt{v_1^2 + v_2^2}\}$.
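
The equivalence used above rests on the identity $(x_1 + x_2)^2 - (x_1 - x_2)^2 = 4x_1x_2$. As a rough numerical sanity check, the following Python sketch compares the two feasibility tests on random points; the function names, the sampling box $[-3, 3]^2$ and the boundary tolerance are our own choices and are not part of the text.

```python
import math
import random


def feasible_original(x1, x2):
    """Constraints 4*x1*x2 >= 1 and x1 + x2 > 0 of the original problem."""
    return 4 * x1 * x2 >= 1 and x1 + x2 > 0


def in_cone(v1, v2, v3):
    """Membership in S = {(v1, v2, v3) : v3 >= sqrt(v1^2 + v2^2)}."""
    return v3 >= math.hypot(v1, v2)


def feasible_conic(x1, x2):
    """Conic form of the constraints: (x1 - x2, 1, x1 + x2) lies in S."""
    return in_cone(x1 - x2, 1.0, x1 + x2)


def count_mismatches(samples, seed=0):
    """Count sampled points on which the two feasibility tests disagree."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(samples):
        x1, x2 = rng.uniform(-3, 3), rng.uniform(-3, 3)
        if abs(4 * x1 * x2 - 1) < 1e-6:
            continue  # skip points too close to the boundary for floats
        if feasible_original(x1, x2) != feasible_conic(x1, x2):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(count_mismatches(10000))
```

A count of zero on such a sample supports, but of course does not prove, the equivalence; the proof is the algebraic identity itself.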
If we consider the problem 













Min X2 — X1 
subject to 
ae: a2 
2 =—% <=} 
then after simple calculations it can be seen that the problem is equivalent to 
Min W = 
subject to 
1 0 X1 
Pt x2 E Si 
a y 2 


a 2 
X2 Xj = S2, 


where $S_1$ and $S_2$ are, respectively, the sets of $3 \times 3$ and $2 \times 2$ symmetric positive semi-definite matrices. Observe that the nonlinear constraints are converted into linear constraints in the resulting equivalent problems, thereby shifting the nonlinearity to the sets $S$, $S_1$ and $S_2$.

At this stage, one may wonder as to how to identify an appropriate set over which a nonlinear programming problem can be converted into a linear problem. This shall become clear during the course of our discussion. For the time being we assume that we have the required skill to convert a nonlinear problem into a linear one over an appropriate set. What we would like to stress is that the considered problems are nonlinear programming problems in the classical sense but the corresponding equivalent problems are virtually LPPs over some set. If we relook at the usual LPP

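$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & Ax \ge b \\
& x \ge 0,
\end{aligned}
\qquad (14.2)
$$

it can be reframed as

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & Ax - b \in \mathbb{R}^m_+ \\
& x \in \mathbb{R}^n_+.
\end{aligned}
\qquad (14.3)
$$

In other words, it is same as asking to solve

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & \bar{A}x - \bar{b} \in S,
\end{aligned}
$$

where $\bar{A} = \begin{pmatrix} A \\ I \end{pmatrix}$, $\bar{b} = \begin{pmatrix} b \\ 0 \end{pmatrix}$, and $S = \mathbb{R}^m_+ \times \mathbb{R}^n_+$.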

Thus, one can say that the given LPP is a linear programming problem over the set $S$. The equivalent problems formulated in the earlier explained examples are similar to LPPs but over different sets. In case of the usual LPP, the set is the product of nonnegative orthants $\mathbb{R}^m_+ \times \mathbb{R}^n_+$ whereas in the nonlinear cases described above, the sets are $\{(v_1, v_2, v_3) : v_3 \ge \sqrt{v_1^2 + v_2^2}\}$, geometrically an ice-cream cone in three dimensional Euclidean space, or $S_1 \times S_2$, the product of the sets of positive semi-definite matrices. We therefore conclude that the basic difference between the usual LPP and the above stated nonlinear programming problems lies in the structure of the sets involved in the constraints. It is this observation that allows us to build a new model of nonlinear optimization while preserving linearity in the objective function and the constraints.

Having said this, there obviously emerge a few potential questions. Can any set be used to describe such models? Do we require some kind of meaningful structure on the set which in turn helps to develop a solution methodology for such a class of optimization problems? Is this kind of modeling interesting purely from the theoretical point of view or can it be useful? Can problems in practice be cast into such models? What are a few applications? What are the existing theoretical and computational results in this direction? We shall be addressing these questions in the sections to follow.

14.3 Cone 

Observe that the inequality $Ax \ge b$ in (14.2) has been expressed as $Ax - b \in \mathbb{R}^m_+$ in (14.3), where $\mathbb{R}^m_+$ is the nonnegative orthant of $\mathbb{R}^m$. What is special about $\mathbb{R}^m_+$? Let us focus on this issue.

The inequality '$\ge$' in $\mathbb{R}^m$ means that for $a, b \in \mathbb{R}^m$,

$$a \ge b \iff a - b \in \mathbb{R}^m_+ \iff a_i \ge b_i \quad (i = 1, \ldots, m).$$

Thus '$\ge$' is the coordinate-wise ordering of vectors in $\mathbb{R}^m$ that satisfies a number of basic properties.

(P1) Reflexive: $a \ge a$, $\forall\, a \in \mathbb{R}^m$.

(P2) Antisymmetric: $a \ge b$ and $b \ge a \Rightarrow a = b$, $\forall\, a, b \in \mathbb{R}^m$.

(P3) Transitive: $a \ge b$ and $b \ge c \Rightarrow a \ge c$, $\forall\, a, b, c \in \mathbb{R}^m$.

(P4) Homogeneous: $a \ge b$ and $\lambda \ge 0 \Rightarrow \lambda a \ge \lambda b$, $\forall\, a, b \in \mathbb{R}^m$, $\lambda \in \mathbb{R}$.

(P5) Additive: $a \ge b$ and $c \ge d \Rightarrow a + c \ge b + d$, $\forall\, a, b, c, d \in \mathbb{R}^m$.

These properties make '$\ge$' a partial order relation on the set $\mathbb{R}^m$ which is compatible with linear operations. We hardly notice this while studying LPP, yet it is these properties that impart the nice features to LPP. These features are implicit in '$\ge$' and we need not repeat them. But at this moment we would certainly like to ask: is '$\ge$' the only possible partial order that can define the vector inequality satisfying the above five properties? The quest to answer this question leads us to the notion of cone.







Definition 14.3.1 (Cone) A non empty subset $K$ of a vector space $X$ is called a cone if $k \in K$, $\lambda \ge 0 \Rightarrow \lambda k \in K$.

For example, the set

$$K = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 \ge x_2 \ge 0,\ x_3 \ge 0\}$$

is a cone. [The remaining examples and the accompanying figure are not recoverable from the scan.]

Definition 14.3.2 (Convex Cone). A cone $K$ is called a convex cone if $K$ is a convex set.

Remark 14.3.1 It is obvious that a cone $K$ is a convex cone if and only if for $k_1, k_2 \in K$, $k_1 + k_2 \in K$. In other words, $K$ is closed under addition.

It can easily be verified that all the cones listed above are convex cones. However, the following cones are not convex.

(i) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 = 0\} \cup \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = 0\}$;

(ii) $K = \{(x_1, x_2) \in \mathbb{R}^2 : |x_1| \le |x_2|\}$;

(iii) $K = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 \ge x_2 \ge 0,\ x_3 = 0\} \cup \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_3 \ge x_1 \ge 0,\ x_2 = 0\}$.


Definition 14.3.3 (Pointed Cone). A cone $K$ is said to be a pointed cone if $K \cap (-K) = \{0\}$, i.e. if $k \in K$ and $-k \in K$ then $k = 0$.

The cones

(i) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge |x_1|\}$;

(ii) $K = \{(x_1, x_2) \in \mathbb{R}^2 : [\text{condition illegible in the scan}]\}$;

(iii) $K = \{A \in \mathbb{R}^{m \times m} : A = A^T,\ x^TAx \ge 0,\ \forall\, x \in \mathbb{R}^m\}$;

are pointed cones while

(i) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 = 0\}$;

(ii) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge 0\} \cup \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \le 0,\ x_2 \ge 0\}$;

(iii) $K = \{x = \{x_n\} : x_n = 0 \text{ for all but finitely many } n\}$;

are not pointed cones. Geometrically, a pointed cone does not contain any line passing through the origin.

Let $K$ be a non empty pointed convex cone in $\mathbb{R}^m$. We induce a new order in $\mathbb{R}^m$ by

$$a \ge_K b \iff a - b \in K.$$

This order relation satisfies the five properties (P1) to (P5). The property (P1), reflexivity, follows as $0 \in K$ by definition of cone, and (P2), antisymmetry, follows on account of $K$ being pointed, while (P3), transitivity, holds in lieu of convexity of $K$ as

$$a \ge_K b \text{ and } b \ge_K c \;\Rightarrow\; a - b \in K,\ b - c \in K \;\Rightarrow\; a - c \in K.$$

The property (P4), homogeneity, follows from the definition of cone and (P5), additivity, follows from the convexity of $K$. Besides these, we shall be requiring two more important properties of $K$ in the discussion, namely that $K$ is closed and has a non empty interior. Note that $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 > |x_1|\} \cup \{0\}$ is a pointed convex cone with a non empty interior but it is not a closed cone; however, $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 = x_2 \ge 0\}$ is a closed convex cone with empty interior.

Summarizing the above discussion, what we find is that the set $\mathbb{R}^m_+$ in LPP is not so important in itself; what is more important is the properties that this set possesses. A closed pointed convex cone with non empty interior also possesses all the desired properties and thus can be a meaningful replacement of $\mathbb{R}^m_+$ in LPP (14.2). Thus, a general optimization problem involving a cone is modeled as

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & Ax - b \ge_K 0.
\end{aligned}
\qquad (14.4)
$$


Problem of type (14.4) is called conic programming problem (CPP). From now onwards, we assume that the cone $K$ in (14.4) is a closed pointed convex cone with non empty interior. LPP falls as a particular case of CPP with $K = \mathbb{R}^m_+$. The other two most interesting and widely studied cones are

m-dimensional Lorentz cone:

$$L^m = \left\{(x_1, \ldots, x_m) \in \mathbb{R}^m : x_m \ge \sqrt{x_1^2 + \cdots + x_{m-1}^2}\right\}.$$

$L^m$ is also known as ice-cream cone or a second order cone. For $m = 1$, $L^1 = \mathbb{R}_+$, and for $m = 2$, $L^2 = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge |x_1|\}$; both are polyhedral cones. For $m \ge 3$, $L^m$ is no longer a polyhedral cone.

Cone of positive semi-definite matrices:

$$S^m_+ = \{A \in \mathbb{R}^{m \times m} : A = A^T,\ A \succeq 0\},$$

where the '$\succeq$' notation is used to denote the partial order of positive semi-definiteness. Thus, when we write $A \succeq 0$ we mean $A$ is a positive semi-definite matrix.

One can easily verify that both the Lorentz cone and the cone of positive semi-definite matrices are non empty closed pointed convex cones with non empty interiors.
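
To make the conic order concrete, the following Python sketch tests membership in the Lorentz cone $L^m$ and uses it to evaluate $a \ge_K b$ for $K = L^m$. The function names `in_lorentz` and `geq_K`, the tolerance and the sample vectors are our own; the sketch only illustrates the definitions and is not a proof of the cone properties.

```python
import math


def in_lorentz(x, tol=1e-12):
    """Membership in L^m: the last entry dominates the norm of the others."""
    *head, last = x
    return last >= math.sqrt(sum(h * h for h in head)) - tol


def geq_K(a, b):
    """Conic order a >=_K b for K = L^m, i.e. a - b lies in L^m."""
    return in_lorentz(tuple(ai - bi for ai, bi in zip(a, b)))


def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(xi + yi for xi, yi in zip(x, y))


p = (3.0, 4.0, 5.0)   # on the boundary of L^3, since sqrt(9 + 16) = 5
q = (1.0, 0.0, 1.0)   # also on the boundary of L^3

if __name__ == "__main__":
    print(in_lorentz(p), in_lorentz(q), in_lorentz(add(p, q)))
    print(in_lorentz(tuple(-v for v in q)))      # -q is not in the cone
    print(geq_K((3.0, 4.0, 6.0), p))             # difference is (0, 0, 1)
    print(in_lorentz((2.0,)), in_lorentz((-1.0,)))  # m = 1 gives R_+
```

For $m = 1$ the list of leading entries is empty, so the test reduces to $x_1 \ge 0$, in agreement with $L^1 = \mathbb{R}_+$.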
14.4 Formulation of Semi-definite Programming

We present certain basic concepts related to the set of symmetric matrices and the vector space structure imposed on it.

Let $S^m$ be the set of $m \times m$ symmetric matrices. Under the matrix addition and multiplication of a matrix by a real scalar, $S^m$ is a finite dimensional real vector space with dimension $m(m+1)/2$. As in $\mathbb{R}^m$ the inner product of two vectors is defined as their dot product, i.e. $\langle a, b \rangle = a^Tb$, here the inner product of two matrices in $S^m$ is defined using the trace function. So, for $U, V \in S^m$ we define

$$\langle U, V \rangle = \mathrm{Tr}(U^TV) = \mathrm{Tr}(UV),$$

where '$\mathrm{Tr}(\cdot)$' denotes the trace function and it is defined as the sum of diagonal elements of a matrix, i.e.

$$\mathrm{Tr}(U) = \mathrm{Tr}\left([u_{ij}]_{i,j=1}^{m}\right) = \sum_{i=1}^{m} u_{ii}.$$

This inner product is called Frobenius inner product on $S^m$.

We shall be using the script letter, like $\mathcal{A}$, to denote a linear mapping from $\mathbb{R}^n$ to $S^m$ and the Latin letter $A$ to denote a matrix in $S^m$.

We are familiar with the linear mappings in $\mathbb{R}^n$. Generally, a linear mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ is described by

$$T(x) = a_1x_1 + \cdots + a_nx_n,$$

where $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, and $a_i \in \mathbb{R}^m$ $(i = 1, \ldots, n)$.

For example, $F : \mathbb{R}^2 \to \mathbb{R}^3$ defined as $F(x_1, x_2) = (x_1 + x_2,\ x_1 - x_2,\ x_2)$ is equivalent to $T(x_1, x_2) = a_1x_1 + a_2x_2$, with $a_1 = (1, 1, 0) \in \mathbb{R}^3$ and $a_2 = (1, -1, 1) \in \mathbb{R}^3$.

On parallel lines, we can specify a linear mapping $\mathcal{A} : \mathbb{R}^n \to S^m$ by

$$\mathcal{A}(x) = A_1x_1 + \cdots + A_nx_n,$$

with $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ and $A_i \in S^m$ $(i = 1, \ldots, n)$. [The text illustrates this with two matrix examples, of the forms $\mathcal{A}(x) = A_1x_1 + A_2x_2 + A_3x_3$ and $\mathcal{A}(x) = A_1x_1 + A_2x_2$; the matrix entries are illegible in the scan.]


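With this background we are ready to describe the optimization problem over the cone of positive semi-definite matrices. A general model of such a conic optimization problem is given by

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & \mathcal{A}x - B \succeq_{S^m_+} 0.
\end{aligned}
\qquad (14.5)
$$

Here $x \in \mathbb{R}^n$, $c \in \mathbb{R}^n$, $B \in S^m$, $\mathcal{A} : \mathbb{R}^n \to S^m$ is a linear mapping, and $S^m_+$ is the cone of $m \times m$ symmetric positive semi-definite matrices. Moreover, when we say $U \succeq_{S^m_+} V$ we mean $U - V \in S^m_+$. In view of the above discussion on the linear mapping $\mathcal{A} : \mathbb{R}^n \to S^m$, problem (14.5) can be expressed as follows

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & A_1x_1 + \cdots + A_nx_n - B \succeq_{S^m_+} 0.
\end{aligned}
\qquad (14.6)
$$

If we write $F(x) = A_1x_1 + \cdots + A_nx_n - B$, the problem is

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & F(x) \succeq_{S^m_+} 0.
\end{aligned}
\qquad (14.7)
$$

The inequality $F(x) \succeq_{S^m_+} 0$ is called linear matrix inequality (LMI) and the conic optimization problem (14.7) is called semi-definite programming problem (SDPP).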
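
To see what feasibility of an LMI means in practice, the following Python sketch evaluates $F(x) = A_1x_1 + A_2x_2 - B$ for a small instance and tests positive semi-definiteness by symmetric elimination. The matrices `A1`, `A2`, `B` are hypothetical data chosen by us for illustration (with them, $F(x) \succeq 0$ exactly when $x_1 - 1 \ge |x_2|$), and `is_psd` is our own helper, not a routine from the book.

```python
def is_psd(M, tol=1e-9):
    """Test positive semi-definiteness of a symmetric matrix by elimination."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            # A zero pivot is admissible only if the rest of its row vanishes.
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True


# Hypothetical data for illustration only.
A1 = [[1, 0], [0, 1]]
A2 = [[0, 1], [1, 0]]
B = [[1, 0], [0, 1]]


def F(x):
    """Evaluate F(x) = A1*x1 + A2*x2 - B entrywise."""
    m = len(B)
    return [[A1[i][j] * x[0] + A2[i][j] * x[1] - B[i][j] for j in range(m)]
            for i in range(m)]


if __name__ == "__main__":
    print(F((3, 1)), is_psd(F((3, 1))))       # [[2, 1], [1, 2]] is PSD
    print(F((1.5, 1)), is_psd(F((1.5, 1))))   # [[0.5, 1], [1, 0.5]] is not
```

Note that $F$ is affine in $x$ even though the feasible set $\{x : F(x) \succeq 0\}$ is described by nonlinear inequalities in $x$; this is the sense in which the nonlinearity has been moved into the cone.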
Remark 14.4.1 (i) As a convex combination of positive semi-definite matrices is positive semi-definite, we have

$$F(x) \succeq_{S^m_+} 0,\ F(y) \succeq_{S^m_+} 0,\ \lambda \in [0, 1] \;\Rightarrow\; F(\lambda x + (1 - \lambda)y) \succeq_{S^m_+} 0.$$

Thus the feasible set of SDPP (14.7) is convex.

(ii) Since the objective function is linear, an optimal solution $x^*$ is on the boundary of the feasible set.

Remark 14.4.2 While defining SDPP (14.7) only one LMI is imposed as a constraint. One should not worry about more LMIs. It follows from the fact that a system of finitely many LMIs can easily be converted into a single LMI. For this, consider $k$ number of LMIs as follows

$$\mathcal{A}_i(x) - B_i \succeq_{S^{m_i}_+} 0, \quad \mathcal{A}_i : \mathbb{R}^n \to S^{m_i},\ B_i \in S^{m_i} \quad (i = 1, \ldots, k).$$

Let $m = m_1 + \cdots + m_k$. Define a linear mapping $\mathcal{A}_0 : \mathbb{R}^n \to S^m$ and a matrix $B_0 \in S^m$ by

$$
\mathcal{A}_0(x) =
\begin{pmatrix}
\mathcal{A}_1(x) & O_{m_1 \times m_2} & \cdots & O_{m_1 \times m_k} \\
O_{m_2 \times m_1} & \mathcal{A}_2(x) & \cdots & O_{m_2 \times m_k} \\
\vdots & \vdots & \ddots & \vdots \\
O_{m_k \times m_1} & O_{m_k \times m_2} & \cdots & \mathcal{A}_k(x)
\end{pmatrix}
= \mathrm{Diag}(\mathcal{A}_1(x), \ldots, \mathcal{A}_k(x)),
$$

$$
B_0 =
\begin{pmatrix}
B_1 & 0 & \cdots & 0 \\
0 & B_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & B_k
\end{pmatrix}
= \mathrm{Diag}(B_1, \ldots, B_k).
$$

Then the system of $k$ LMIs, $\mathcal{A}_i(x) - B_i \succeq_{S^{m_i}_+} 0$ $(i = 1, \ldots, k)$, is equivalent to a single LMI, $\mathcal{A}_0(x) - B_0 \succeq_{S^m_+} 0$. To illustrate this result, the text combines a system of two LMIs into a single block diagonal LMI. [The entries of this worked illustration are illegible in the scan.]

This observation leads us to study SDPP (14.7) with one LMI only.

The next result is very important and is often described as the principal tool to reformulate a nonlinear programming problem as an appropriate SDPP.
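
The block diagonal construction of Remark 14.4.2 can be sketched in a few lines of Python. The two LMIs used here, $F_1(x) = (x_1 - 1) \succeq 0$ and $F_2(x) = \begin{pmatrix} x_1 & x_2 \\ x_2 & 1 \end{pmatrix} \succeq 0$, are hypothetical examples of our own and are not the ones from the text; `block_diag` and `is_psd` are our helpers.

```python
def is_psd(M, tol=1e-9):
    """Test positive semi-definiteness of a symmetric matrix by elimination."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True


def block_diag(*blocks):
    """Place square blocks along the diagonal of a larger zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                M[offset + i][offset + j] = float(value)
        offset += len(b)
    return M


def F1(x):
    """Hypothetical 1x1 LMI: x1 - 1 >= 0."""
    return [[x[0] - 1.0]]


def F2(x):
    """Hypothetical 2x2 LMI: [[x1, x2], [x2, 1]] is PSD."""
    return [[x[0], x[1]], [x[1], 1.0]]


def combined(x):
    """Single LMI matrix Diag(F1(x), F2(x))."""
    return block_diag(F1(x), F2(x))


if __name__ == "__main__":
    for x in [(2.0, 1.0), (0.5, 0.0), (2.0, 2.0)]:
        separate = is_psd(F1(x)) and is_psd(F2(x))
        print(x, separate, is_psd(combined(x)))
```

On each sample point the combined test agrees with the conjunction of the separate tests, as the remark asserts for block diagonal matrices.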
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518 Numerical Optimization with Applications 
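Lemma 14.4.1 Let $A = \begin{pmatrix} M & C^T \\ C & D \end{pmatrix}$ be a symmetric $(l+k) \times (l+k)$ matrix with $k \times k$ block $M$, $l \times k$ block $C$ and $l \times l$ block $D$. Suppose $M$ is a positive definite matrix. Then $A$ is a positive semi-definite (respectively, positive definite) matrix if and only if $D - CM^{-1}C^T$ is a positive semi-definite (respectively, positive definite) matrix.

Proof. The positive semi-definiteness of $A$ is equivalent to saying

$$
\begin{pmatrix} x^T & y^T \end{pmatrix}
\begin{pmatrix} M & C^T \\ C & D \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} \ge 0, \quad \forall\, x \in \mathbb{R}^k,\ y \in \mathbb{R}^l,
$$

i.e.

$$x^TMx + 2x^TC^Ty + y^TDy \ge 0, \quad \forall\, x \in \mathbb{R}^k,\ y \in \mathbb{R}^l.$$

The above inequality yields

$$\inf_{x \in \mathbb{R}^k} \left(x^TMx + 2x^TC^Ty + y^TDy\right) \ge 0, \quad \forall\, y \in \mathbb{R}^l. \qquad (14.8)$$

Now, for every fixed $y \in \mathbb{R}^l$, the infimum in the above problem is attained at $x^* = -M^{-1}C^Ty$, and the optimal value is $y^T(D - CM^{-1}C^T)y$. It is important to note that the positive definiteness of $M$ is used to infer that $x^*$ is an optimal solution of problem (14.8).

Thus, the positive semi-definiteness (respectively, positive definiteness) of $A$ is equivalent to $y^T(D - CM^{-1}C^T)y \ge 0$, $\forall\, y \in \mathbb{R}^l$ (respectively, $y^T(D - CM^{-1}C^T)y > 0$, $\forall\, y \in \mathbb{R}^l$, $y \ne 0$). In other words,

$$A \succeq_{S^{l+k}_+} 0 \ (\text{respectively, } \succ 0) \iff D - CM^{-1}C^T \succeq_{S^l_+} 0 \ (\text{respectively, } \succ 0). \qquad \Box$$

Remark 14.4.3 The matrix $D - CM^{-1}C^T$ is called the Schur complement of $M$ in $A$, named after its discoverer, Issai Schur.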
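
A small numerical instance may help to fix the roles of the blocks in Lemma 14.4.1. The following Python sketch takes $k = 2$, $l = 1$, with hypothetical blocks $M$, $C$, $D$ chosen by us, computes the Schur complement $D - CM^{-1}C^T$ using the closed form inverse of a $2 \times 2$ matrix, and compares its sign with a direct semi-definiteness test of the assembled matrix $A$.

```python
def is_psd(M, tol=1e-9):
    """Test positive semi-definiteness of a symmetric matrix by elimination."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True


def schur_complement(M, C, D):
    """D - C M^{-1} C^T for a 2x2 block M, a 1x2 block C and a 1x1 block D."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Minv = [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]
    c = C[0]
    quad = sum(c[i] * Minv[i][j] * c[j] for i in range(2) for j in range(2))
    return D[0][0] - quad


def assemble(M, C, D):
    """Build A = [[M, C^T], [C, D]] as a 3x3 matrix."""
    return [M[0] + [C[0][0]], M[1] + [C[0][1]], C[0] + D[0]]


# Hypothetical blocks: M is positive definite (eigenvalues 1 and 3).
M = [[2.0, 1.0], [1.0, 2.0]]
C = [[1.0, 1.0]]

if __name__ == "__main__":
    for d in (1.0, 0.5):
        D = [[d]]
        print(d, schur_complement(M, C, D), is_psd(assemble(M, C, D)))
```

Here $CM^{-1}C^T = 2/3$, so the Schur complement is $1/3$ for $D = 1$ and $-1/6$ for $D = 1/2$; the direct test on $A$ changes from positive semi-definite to not positive semi-definite accordingly.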


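14.5 Applications

At this juncture one would like to see what kind of problems can be cast in the form of SDPP.

Consider the LPP

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & Ax \ge b,
\end{aligned}
$$

where $A$ is an $m \times n$ matrix with columns $a_1, \ldots, a_n$ and $b \in \mathbb{R}^m$. Define $A_i = \mathrm{Diag}(a_i)$ $(i = 1, \ldots, n)$, $B = \mathrm{Diag}(b)$, and let $F(x) = \sum_{i=1}^{n} A_ix_i - B$. Then the LPP can be expressed as a SDPP

$$
\begin{aligned}
\text{Min} \quad & c^T x \\
\text{subject to} \quad & F(x) \succeq_{S^m_+} 0.
\end{aligned}
$$

For example, consider the LPP

$$
\begin{aligned}
\text{Min} \quad & 2x_1 + 3x_2 \\
\text{subject to} \quad & x_1 + x_2 \le 6 \\
& -x_1 + 2x_2 \le 8 \\
& x_1, x_2 \ge 0.
\end{aligned}
$$

Writing each constraint in the form "expression $\ge 0$", the given LPP becomes the SDPP

$$
\begin{aligned}
\text{Min} \quad & 2x_1 + 3x_2 \\
\text{subject to} \quad & \mathrm{Diag}\left(6 - x_1 - x_2,\ 8 + x_1 - 2x_2,\ x_1,\ x_2\right) \succeq_{S^4_+} 0.
\end{aligned}
$$

Another class of problems that arise frequently and can be cast as SDPP is the class of quadratically constrained quadratic programming problems

$$
\begin{aligned}
\text{Min} \quad & x^TQ_0x + c_0^Tx + d_0 \\
\text{subject to} \quad & x^TQ_1x + c_1^Tx + d_1 \le 0.
\end{aligned}
\qquad (14.9)
$$

Here, $Q_i \in S^n$ $(i = 0, 1)$ are positive definite matrices, $c_i \in \mathbb{R}^n$, $d_i \in \mathbb{R}$ $(i = 0, 1)$, $x \in \mathbb{R}^n$.

By introducing an additional variable $t$ the problem can be converted into an optimization problem with linear objective function

$$
\begin{aligned}
\text{Min} \quad & t \\
\text{subject to} \quad & x^TQ_0x + c_0^Tx + d_0 \le t \\
& x^TQ_1x + c_1^Tx + d_1 \le 0.
\end{aligned}
\qquad (14.10)
$$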

Here we take a small pause to recall from the eigen value characterization in matrix theory that if $U$ is a symmetric positive definite matrix then there exist an orthogonal matrix $M$ and a diagonal matrix $\Lambda$ with positive entries $\lambda_j$ such that $U = M\Lambda M^T$. Define $\Lambda^{1/2} = \mathrm{Diag}(\sqrt{\lambda_j})$, and $U^{1/2} = M\Lambda^{1/2}M^T$. Then $U^{1/2}$ is a positive definite matrix and $U^{1/2}U^{1/2} = U$.

From the above observation, we infer that there exist positive definite matrices $P_i$ $(i = 0, 1)$, such that $Q_i = P_i^TP_i$ $(i = 0, 1)$. So, the constraints in (14.10) become

$$(P_ix)^T(P_ix) + c_i^Tx + d_i - \epsilon_i \le 0 \quad (i = 0, 1), \qquad \epsilon_0 = t,\ \epsilon_1 = 0.$$

Applying Lemma 14.4.1, the two inequalities can be reframed as

$$
\begin{pmatrix}
I_{n \times n} & P_0x & O_{n \times n} & O_{n \times 1} \\
x^TP_0^T & -c_0^Tx - d_0 + t & O_{1 \times n} & O_{1 \times 1} \\
O_{n \times n} & O_{n \times 1} & I_{n \times n} & P_1x \\
O_{1 \times n} & O_{1 \times 1} & x^TP_1^T & -c_1^Tx - d_1
\end{pmatrix}
\succeq_{S^{2n+2}_+} 0.
$$

Thus, problem (14.9) is equivalent to a SDPP.

Remark 14.5.1 An inequality of the form $x^TP^TPx + c^Tx + d \le 0$ along with Lemma 14.4.1 yield

$$
\begin{pmatrix}
I & Px \\
x^TP^T & -(c^Tx + d)
\end{pmatrix}
\succeq 0,
$$

irrespective of the nature of the matrix $P$.
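
The reformulation in Remark 14.5.1 can be checked numerically on a small instance. The following Python sketch compares the quadratic inequality $x^TP^TPx + c^Tx + d \le 0$ with positive semi-definiteness of the corresponding block matrix on random points. The data `P`, `c`, `d`, the sampling box and the boundary tolerance are hypothetical choices of ours.

```python
import random


def is_psd(M, tol=1e-9):
    """Test positive semi-definiteness of a symmetric matrix by elimination."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    for k in range(n):
        pivot = A[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            if any(abs(A[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = A[i][k] / pivot
            for j in range(k + 1, n):
                A[i][j] -= factor * A[k][j]
    return True


# Hypothetical data for illustration only.
P = [[1.0, 2.0], [0.0, 1.0]]
c = [1.0, -1.0]
d = -4.0


def quad_value(x):
    """Value of x^T P^T P x + c^T x + d."""
    Px = [sum(P[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(v * v for v in Px) + sum(ci * xi for ci, xi in zip(c, x)) + d


def lmi_matrix(x):
    """Block matrix [[I, Px], [(Px)^T, -(c^T x + d)]]."""
    Px = [sum(P[i][j] * x[j] for j in range(2)) for i in range(2)]
    lin = sum(ci * xi for ci, xi in zip(c, x)) + d
    return [[1.0, 0.0, Px[0]], [0.0, 1.0, Px[1]], [Px[0], Px[1], -lin]]


def count_mismatches(samples, seed=0):
    """Count sampled points where the two feasibility tests disagree."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(samples):
        x = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        value = quad_value(x)
        if abs(value) < 1e-6:
            continue  # too close to the boundary for a floating point test
        if (value <= 0) != is_psd(lmi_matrix(x)):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(count_mismatches(5000))
```

The agreement reflects the fact that the Schur complement of the identity block in this matrix is exactly $-(x^TP^TPx + c^Tx + d)$.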
Recall the example from Section 14.2


ni PET TAAR Min Di K 
subject to 


<_ 
aa 










Ta 
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= (x Wg aT ER m 
: 9 l X? | = <0 
= ( x1 e) o ye its 
W)C ofa Jeo 
he following SDPP Programming problem is equivalent 
Min te =; 
subject to 
hee BOF 0 20.1 0 
Oia 0. 0.0 
Xi a e000 
0 0 0 0 1 X92 
0 0 0 0 X92 Xi 


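Another interesting problem that can be transformed as a SDPP is

$$
\begin{aligned}
\text{Min} \quad & \frac{(c^Tx)^2}{d^Tx} \\
\text{subject to} \quad & Ax \ge b,
\end{aligned}
$$

with an assumption that $d^Tx > 0$ whenever $Ax \ge b$.

The above problem is equivalent to

$$
\begin{aligned}
\text{Min} \quad & t \\
\text{subject to} \quad & Ax \ge b \\
& (c^Tx)^2 \le t\, d^Tx.
\end{aligned}
$$

Using the Schur complement (Lemma 14.4.1), the constraints can be reformulated as an LMI in the variables $x$ and $t$ as

$$
\begin{pmatrix}
\mathrm{Diag}(Ax - b) & 0 & 0 \\
0 & t & c^Tx \\
0 & c^Tx & d^Tx
\end{pmatrix}
\succeq 0.
$$

The applications discussed here are in no way exhaustive but provide some insight into the ability of SDP in handling various situations.

To begin with we cite the most common eigen value problem that can be beautifully formulated as SDPP. The problem is to find the minimum of the maximum eigen value of a matrix; the problem can be stated as follows. Suppose $\mathcal{A} : \mathbb{R}^n \to S^m$ is a linear mapping. Find $z^* \in \mathbb{R}^n$ such that $z^*$ solves the optimization problem

$$\min_{z}\ \lambda_{\max}(\mathcal{A}(z)), \qquad (14.11)$$

where $\lambda_{\max}(\mathcal{A}(z))$ denotes the maximum eigen value of the symmetric matrix $\mathcal{A}(z)$. Note that

$$\lambda_{\max}(\mathcal{A}(z)) \le \eta \iff \lambda_{\min}(\eta I_{m \times m} - \mathcal{A}(z)) \ge 0 \iff \eta I_{m \times m} - \mathcal{A}(z) \succeq_{S^m_+} 0.$$

Thus problem (14.11) can equivalently be posed as

$$
\begin{aligned}
\text{Min} \quad & \eta \\
\text{subject to} \quad & \eta I_{m \times m} - \mathcal{A}(z) \succeq_{S^m_+} 0.
\end{aligned}
$$

Another related problem is to minimize the spectral norm

$$\text{Min}\ \|\mathcal{A}(x)\|_2,$$

where $\mathcal{A} : \mathbb{R}^n \to \mathbb{R}^{p \times q}$ is a linear mapping and $x \in \mathbb{R}^n$. Observe that here $\mathcal{A}(x)$ need not be a symmetric matrix. This problem can be cast as the SDPP

$$
\begin{aligned}
\text{Min} \quad & \eta \\
\text{subject to} \quad &
\begin{pmatrix}
\eta I_{p \times p} & \mathcal{A}(x) \\
\mathcal{A}(x)^T & \eta I_{q \times q}
\end{pmatrix}
\succeq 0.
\end{aligned}
$$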

One has to realize that though the original problems are not actually in the form $F(x) \succeq_{S^m_+} 0$, by intelligently using the Schur complement they can be modeled as SDPP. Recognizing the Schur complement in a nonlinear expression is the key step in reformulating a nonlinear optimization problem as SDPP.

To appreciate the power of SDP we mention a few more areas where the problems can be modeled as SDPP; these include problems in statistics, eigen value problems and problems involving polynomials.

In particular, we would like to sketch the role of SDP in machine learning. The reason behind this is that later a complete chapter (Chapter 16) is devoted to a detailed analysis of pattern classification problems. The principal aim in the problem of pattern classification is to classify two patterns by finding an appropriate separating hyperplane. In many cases the separating surface is nonlinear. The problem is transformed into an optimization problem involving a quadratic objective function. The 'kernel function' contributes to the quadratic term. Subsequently, the primary optimization problem implicitly involves the problem of choosing the right 'kernel function'. The latter is a hard problem to solve and it has led to the emergence of a new area of research called kernel optimization. The problem of kernel optimization is translated into an optimization model which is a SDPP very naturally. Consequently the problem of kernel optimization is addressed via SDP. This area is still very young and a lot is needed to be done before drawing any conclusion. But our idea is to highlight that, besides the traditional areas, SDP is in the center of many emerging areas. Of course one can easily extract information on SDP and its applications in various areas through the vast literature available on this topic.


14.6 Formulation of the Dual Problem

We now address the theoretical issues related to SDP. One of the most elegant theoretical results in LP is the LP duality. Since SDP can be regarded as an extension of LP, it is natural to raise the issue whether such a duality theory exists for SDP. We shall witness in this section that the duality theory for SDP is closely related to the duality theory for LPP but there are some differences too. Before proceeding further we must introduce the notion of dual cone.

Definition 14.6.1 (Dual Cone). Suppose $K$ is a non empty subset of $\mathbb{R}^m$. Then the dual cone of $K$ is the set

$$K^* = \{v \in \mathbb{R}^m : \langle v, k \rangle \ge 0,\ \forall\, k \in K\}.$$

For example, if

(i) $K = \mathbb{R}^2_+$ then $K^* = \mathbb{R}^2_+$;

(ii) $K = \mathbb{R}^2$ then $K^* = \{0\}$;

(iii) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge 0\}$ then $K^* = \{(v_1, v_2) \in \mathbb{R}^2 : v_1 = 0,\ v_2 \ge 0\}$;

(iv) $K = \{(1, 0), (0, 1)\}$ then $K^* = \mathbb{R}^2_+$;

(v) $K = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 \ge x_2 \ge 0\}$ then $K^* = \{(v_1, v_2) \in \mathbb{R}^2 : v_1 \ge 0,\ v_1 + v_2 \ge 0\}$.

In the definition to follow we assume that the set $K$ is a cone.

Definition 14.6.2 (Self Dual Cone). A cone $K$ is said to be a self dual cone if $K = K^*$.

For example, $\mathbb{R}^m_+$ is a self dual cone and so is the cone $\{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge |x_1|\}$ but $\{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge 0\}$ is not a self dual cone.
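
For a cone generated by finitely many vectors, membership in the dual cone only needs to be tested against the generators, since every element of the cone is a nonnegative combination of them. The following Python sketch applies this to $K = \{(x_1, x_2) : x_1 \ge x_2 \ge 0\}$, which is generated by $(1, 0)$ and $(1, 1)$; the generator description and the function names are our own.

```python
# K = {(x1, x2) : x1 >= x2 >= 0} is generated by (1, 0) and (1, 1),
# because a*(1, 0) + b*(1, 1) = (a + b, b) with a, b >= 0.
GENERATORS = [(1.0, 0.0), (1.0, 1.0)]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def in_K(x, tol=1e-12):
    """Membership in K, written directly from its definition."""
    return x[0] >= x[1] - tol and x[1] >= -tol


def in_dual(v, tol=1e-12):
    """Membership in K*: nonnegative inner product with every generator."""
    return all(dot(v, g) >= -tol for g in GENERATORS)


if __name__ == "__main__":
    # (1, -1) satisfies v1 >= 0 and v1 + v2 >= 0, so it lies in K* but not in K.
    print(in_dual((1.0, -1.0)), in_K((1.0, -1.0)))
    # (-1, 2) violates v1 >= 0, so it is not in K*.
    print(in_dual((-1.0, 2.0)))
```

Since $(1, -1)$ belongs to $K^*$ but not to $K$, this cone is strictly smaller than its dual and hence is not self dual.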


Lemma 14.6.1 The cone of symmetric positive semi-definite matrices is a self dual
cone, i.e. $S^m_+ = (S^m_+)^*$.
Proof. Recall that the inner product in the vector space $S^m$ is defined as
$\langle U, V \rangle = \mathrm{Tr}(UV)$. So,

$$(S^m_+)^* = \{U \in S^m : \mathrm{Tr}(UV) \geq 0, \ \forall\, V \in S^m_+\}.$$

We first prove that $S^m_+ \subseteq (S^m_+)^*$. Let $U, V \in S^m_+$. Then we can express
$U = M \Lambda M^T$, where $M$ is an orthogonal matrix and $\Lambda$ is a diagonal matrix with
nonnegative diagonal entries. Then

$$\langle U, V \rangle = \mathrm{Tr}(UV) = \mathrm{Tr}(M \Lambda M^T V) = \mathrm{Tr}((M^T V M)\Lambda).$$

Note that the last equality follows on account of $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$. Now,
$y^T M^T V M y = (My)^T V (My) \geq 0$ for all $y \in \mathbb{R}^m$, so $M^T V M$ is a positive
semi-definite matrix. Thus its diagonal entries are nonnegative, and $\Lambda$ is a diagonal
matrix with nonnegative entries.

Consequently, $\mathrm{Tr}(UV) \geq 0$, thereby yielding the desired containment.
Conversely, let $U \in (S^m_+)^*$. Then

$$\mathrm{Tr}(UV) \geq 0, \quad \forall\, V \in S^m_+.$$

Let $w \in \mathbb{R}^m$. Choose $V = ww^T \in S^m$. Then $x^T V x = (w^T x)^2 \geq 0$ for all
$x \in \mathbb{R}^m$. Thus $V \in S^m_+$. With this choice of $V$,

$$\mathrm{Tr}(U w w^T) = \mathrm{Tr}(w^T U w) = w^T U w \geq 0 \ \Rightarrow \ U \in S^m_+,$$

completing the requisite containment and hence the equality of the two sets.
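The two halves of the proof of Lemma 14.6.1 can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`gram`, `trace_of_product`, `random_psd`) are hypothetical, and random sampling illustrates the lemma but proves nothing. It builds positive semi-definite matrices as $RR^T$, confirms that $\mathrm{Tr}(UV)$ stays nonnegative, and then uses a rank-one matrix $V = ww^T$ to exclude a symmetric matrix that is not positive semi-definite.

```python
import random


def gram(R):
    """Return R R^T for a square list-of-lists R; the result is symmetric PSD."""
    n = len(R)
    return [[sum(R[i][k] * R[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_product(U, V):
    """Tr(UV) = sum over i, j of U[i][j] * V[j][i]."""
    n = len(U)
    return sum(U[i][j] * V[j][i] for i in range(n) for j in range(n))


def random_psd(n, rng):
    """Sample a random n x n positive semi-definite matrix as R R^T."""
    R = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return gram(R)


rng = random.Random(0)

# First containment: Tr(UV) >= 0 whenever U and V are both PSD.
min_trace = min(trace_of_product(random_psd(3, rng), random_psd(3, rng))
                for _ in range(1000))

# Converse: a symmetric matrix that is not PSD fails the test for some V = w w^T.
U_bad = [[1.0, 0.0], [0.0, -1.0]]
w = [0.0, 1.0]
V = [[w[i] * w[j] for j in range(2)] for i in range(2)]
witness = trace_of_product(U_bad, V)  # equals w^T U_bad w

print(min_trace, witness)
```

Here `witness` equals $w^T U w$, which is the quantity used in the converse part of the proof.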
Thus we have $(S^m_+)^* = S^m_+$. It is thus expected that the dual of an SDPP can be
constructed on the lines of LP duality. Recall what we studied in LP duality: if the
primal LPP is

Min $c^T x$
subject to
    $Ax \geq b$                                  (14.12)

then the associated dual is

Max $b^T \lambda$
subject to
    $A^T \lambda = c$
    $\lambda \geq 0.$                            (14.13)

Taking inspiration from the primal-dual models (14.12)-(14.13), we construct the dual of
the following SDPP

Min $c^T x$
subject to
    $\mathcal{A}(x) \succeq_{S^m_+} B.$          (14.14)
Since the constraint is in the form of LMI, it is immediate that the dual variable
(or the Lagrange multiplier) shall be a matrix, say $\Lambda$. Consequently the dot product
$b^T \lambda$ between the two vectors $b$ and $\lambda$ in the objective function of (14.13)
should be replaced by the inner product between two matrices,
$\langle B, \Lambda \rangle = \mathrm{Tr}(B\Lambda)$. Further, $A^T$ is the conjugate of the
matrix $A$ in (14.13), so in the context of SDPP it is to be replaced by the conjugate of the
linear mapping $\mathcal{A}$, denoted by $\mathcal{A}^*$. Remember that the constraint
inequality $Ax \geq b$ in (14.12) is the same as $Ax - b \in \mathbb{R}^m_+$, and thus
$\lambda \in (\mathbb{R}^m_+)^* = \mathbb{R}^m_+$. Taking into account that the cone $S^m_+$
is a self dual cone, we have $\Lambda \in S^m_+$. Putting all these facts together, the dual
to SDPP (14.14) is another SDPP given by

Max $\mathrm{Tr}(B\Lambda)$
subject to
    $\mathcal{A}^*(\Lambda) = c$                 (14.15)
    $\Lambda \succeq_{S^m_+} 0.$


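We next analyze the equation $\mathcal{A}^*(\Lambda) = c$. For a linear map
$\mathcal{A} : \mathbb{R}^n \to S^m$, the conjugate map $\mathcal{A}^* : S^m \to \mathbb{R}^n$
is described by

$$\langle \mathcal{A}(x), \Lambda \rangle = \langle x, \mathcal{A}^*(\Lambda) \rangle, \quad \forall\, x \in \mathbb{R}^n,\ \Lambda \in S^m. \tag{14.16}$$

An important point to observe is that the inner product on the left side of (14.16) is the
inner product on $S^m$, while the inner product on the right side is the one on
$\mathbb{R}^n$. With $\mathcal{A}(x) = A_1 x_1 + \cdots + A_n x_n$, we have

$$\langle \mathcal{A}(x), \Lambda \rangle = \mathrm{Tr}((A_1 x_1 + \cdots + A_n x_n)\Lambda). \tag{14.17}$$

By linearity of the trace, the right side of (14.17) equals
$\sum_{i=1}^n x_i \mathrm{Tr}(A_i \Lambda)$, so that the $i$-th component of
$\mathcal{A}^*(\Lambda)$ is $\mathrm{Tr}(A_i \Lambda)$.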
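The identity (14.16) is easy to verify on a small instance. The sketch below is illustrative only: it assumes the map has the form $\mathcal{A}(x) = \sum_i x_i A_i$ with symmetric $A_i$, and the matrices, the vector and the helper names (`apply_map`, `adjoint`) are all made up for the example.

```python
def trace_of_product(U, V):
    """Tr(UV) for square matrices given as lists of lists."""
    n = len(U)
    return sum(U[i][j] * V[j][i] for i in range(n) for j in range(n))


def apply_map(mats, x):
    """A(x) = x_1 A_1 + ... + x_n A_n (assumed form of the linear map)."""
    m = len(mats[0])
    return [[sum(xi * A[r][c] for xi, A in zip(x, mats)) for c in range(m)]
            for r in range(m)]


def adjoint(mats, Lam):
    """A*(Lam) = (Tr(A_1 Lam), ..., Tr(A_n Lam))."""
    return [trace_of_product(A, Lam) for A in mats]


# Hypothetical data, chosen only for illustration.
A1 = [[1.0, 0.0], [0.0, 0.0]]
A2 = [[0.0, 1.0], [1.0, 0.0]]
x = [3.0, -2.0]
Lam = [[2.0, 0.5], [0.5, 1.0]]

lhs = trace_of_product(apply_map([A1, A2], x), Lam)               # inner product on S^m
rhs = sum(xi * ci for xi, ci in zip(x, adjoint([A1, A2], Lam)))   # inner product on R^n
print(lhs, rhs)
```

Both sides evaluate to the same number, which is what the definition of the conjugate map requires.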

femme URENA A 


; ‘lv 
Wa AR: 
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X Zr: 0 
unrestricted 


14.7 Duality Theorems


In this section we describe the duality relations between the following pair of primal
SDPP

Min $c^T x$
subject to
    $\mathcal{A}(x) \succeq_{S^m_+} B$           (14.19)

and its dual SDPP

Max $\mathrm{Tr}(B\Lambda)$
subject to
    $\mathcal{A}^*(\Lambda) = c$                 (14.20)
    $\Lambda \succeq_{S^m_+} 0.$


Theorem 14.7.1 SDPP is a symmetric problem, i.e. the dual of SDPP (14.20) is the 
primal SDPP (14.19). 


Proof. Writing the dual SDPP (14.20) in the following form

− Min $-\mathrm{Tr}(B\Lambda)$
subject to
    $\mathcal{A}^*(\Lambda) \geq c$              (14.21)
    $-\mathcal{A}^*(\Lambda) \geq -c$
    $\Lambda \succeq_{S^m_+} 0,$

we can write the dual of (14.21) as follows

− Max $c^T y_1 - c^T y_2$
subject to
    $\mathcal{A}(y_1) - \mathcal{A}(y_2) + W = -B$   (14.22)
    $y_1, y_2 \geq 0,\ W \succeq_{S^m_+} 0,$

where $W$ is the multiplier associated with the constraint $\Lambda \succeq_{S^m_+} 0$.
Taking $y_2 - y_1 = y$, and using the linearity of $\mathcal{A}(\cdot)$ and the fact that
$W \succeq_{S^m_+} 0$, (14.22) can be rewritten as

Min $c^T y$
subject to
    $\mathcal{A}(y) \succeq_{S^m_+} B.$

The above problem is the same as the primal SDPP (14.19).


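Theorem 14.7.2 (Weak Duality Theorem). Let $x$ be feasible for the primal SDPP
(14.19) and $\Lambda$ be feasible for the dual SDPP (14.20). Then $c^T x \geq \mathrm{Tr}(B\Lambda)$.
Proof. Since $\mathcal{A}(x) - B \succeq_{S^m_+} 0$ and $\Lambda \succeq_{S^m_+} 0$, Lemma 14.6.1
gives

$$\langle \mathcal{A}(x) - B, \Lambda \rangle \geq 0$$
$$\Rightarrow \ \mathrm{Tr}(B\Lambda) \leq \langle \mathcal{A}(x), \Lambda \rangle = \langle x, \mathcal{A}^*(\Lambda) \rangle = \langle x, c \rangle = c^T x.$$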
Definition 14.7.1 (Duality Gap). Let $x$ be a feasible solution of the primal SDPP
(14.19) and $\Lambda$ be a feasible solution of the dual SDPP (14.20). The difference between
the objective values, $c^T x - \mathrm{Tr}(B\Lambda)$, is called the duality gap between (14.19)
and (14.20).


So far the duality results for SDPP follow on the lines of LP duality. We now examine the
conditions under which the strong duality result holds.

f Example 14.7.1 Wri 
D ERAAN. rite the dual o th ; 
— Nis f te following SDPP (primal) 







= i 7 y IZ. = 
= Subject to 


a z 





7° = “ 
i r_- - 
m D. B 1 
kr , a 
=ð . f 
> S a má s 
Es — ee m 
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nd its d 

° problems. ual Problem, Show that th 

‘ ae ere 
sion Gince x is & positive semi-definite es 

gol ative, thereby jes 4 = 0. Thus the © 


t 
(14.29) gored 


yeoptimel solution is : à ; , XER. 


The corresponding dual SDPP is given by 
siaz — A33 
subject to 


, optimal solutions of the primal SDpp 
La positive duality gap between, the tw å 


min rs f ~ 


CA 
Aga 528i Ans 


EN 
a Fo 0 =y >g 0. 
À13 À23 A33 
| SDPp 
The feasible set of the dual SDPP is described by these constraints. From the constraint
of the dual SDPP we infer $-(1 - \Lambda_{33})^2/4 \geq 0$, implying $\Lambda_{33} = 1$.

Hence the optimal value of the dual objective is $-1$.
Observe that the duality gap is 1 and not 0, unlike in LPP where there is no duality
gap at the optimal solutions.

In the problem to follow, both the primal SDPP and its dual SDPP have equal optimal
values but the primal SDPP is not solvable.


| 

Example 14.7.2 Consider the primal SDPP 
Inf X1 
subject to 


E E f ch Js 0 


ae 









Write the dual of the primal SDPP. Show that the optimal values of the primal SDPP
and the dual are equal, but the primal SDPP does not possess an optimal solution.

The dual SDPP is as follows


—2A12 
7 k 4 


Dene 
annan 


Di 
‘77 T 
Bist J 


_ A 

ry ie 
Aer Th) 
U 





jees S pEr 
The constraints of the primal-dual pair yield that $x_1 \geq 0$, $x_1 x_2 \geq 1$, and
$\Lambda_{12} = 0$. Thus the optimal values of both the primal problem and its dual problem
are 0, i.e. there is zero duality gap. However, the lower bound of the primal problem is
$1/x_2$, which tends to 0 as $x_2 \to \infty$. Consequently the primal SDPP is not solvable
as its optimal value is not attainable.

The next problem is included to depict that even though the primal SDPP is infeasible,
the dual SDPP is solvable with finite optimal value.

Consider the primal SDPP


The objective function can be taken as a zero function, i.e. 
X}. A matrix X satisfying the first two constraints of the problem is 


of the form ? But such i iti TEN: 
1 . But such a matrix can not be a positive semi-definite matrix, So 


the SDPP is infeasible. 
The dual of this problem is the following SDPP 
Min 2A2 
subject to 


The set of optimal solutions of the dual SDPP is ( Ban 0 | ALZ | and the 
Pap oe 


optimal value is 0. 


The above examples show that a straightforward extension of LPP duality is not possible.
It also motivates us to look for conditions that ensure zero duality gap between the
primal SDPP (14.19) and its dual SDPP (14.20). The search leads us to the following
concept.


E TPP (14.19) is said to be strictly feasible f 


aia 
d 


-OTI PT n i T . 0 
ol 


o T 









-ven to justify that the corresponding dual Spp 


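Example 14.7.3 Consider the following SDPP (primal)

Min $x_1 + x_2$
subject to
$$\begin{pmatrix} x_1 & x_2 \\ x_2 & 1 \end{pmatrix} \succeq_{S^2_+} 0.$$

Obtain the optimal solutions of the primal SDPP and its dual. Show that the optimal
values of both problems coincide.

Solution The feasibility condition can be reframed as the system of nonlinear
inequalities $x_1 \geq 0$, $x_1 - x_2^2 \geq 0$, thereby providing the optimal solution of
the SDPP as $(x_1, x_2) = (1/4, -1/2)$, and the optimal value is $-1/4$.

The dual SDPP is given by

Max $-\Lambda_{22}$
subject to
    $\Lambda_{11} = 1$
    $2\Lambda_{12} = 1$
    $\begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{12} & \Lambda_{22} \end{pmatrix} \succeq_{S^2_+} 0.$

The feasible set of the dual SDPP is
$\left\{ \begin{pmatrix} 1 & 1/2 \\ 1/2 & \Lambda_{22} \end{pmatrix} : \Lambda_{22} \geq 1/4 \right\}$.
Clearly the optimal solution is $\begin{pmatrix} 1 & 1/2 \\ 1/2 & 1/4 \end{pmatrix}$ and the
optimal value is $-1/4$.

Note that both the primal SDPP and the dual SDPP are strictly feasible and possess
optimal solutions.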
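The zero duality gap in Example 14.7.3 can be confirmed by direct evaluation. The sketch below rests on an assumption: the constraint matrix is taken to be $\begin{pmatrix} x_1 & x_2 \\ x_2 & 1 \end{pmatrix}$, which is the form consistent with the stated feasibility conditions $x_1 \geq 0$, $x_1 - x_2^2 \geq 0$ and with the dual constraints. The helper names and the grid are hypothetical, and the grid search only illustrates weak duality rather than proving optimality.

```python
def is_psd_2x2(M, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff both diagonal entries and the determinant are >= 0."""
    (a, b), (c, d) = M
    return abs(b - c) <= tol and a >= -tol and d >= -tol and a * d - b * c >= -tol


def primal_matrix(x1, x2):
    """Assumed constraint matrix of the primal SDPP in Example 14.7.3."""
    return [[x1, x2], [x2, 1.0]]


# Candidate optimal solutions of the primal and the dual.
x_opt = (0.25, -0.5)
Lam_opt = [[1.0, 0.5], [0.5, 0.25]]

primal_value = x_opt[0] + x_opt[1]   # objective x1 + x2
dual_value = -Lam_opt[1][1]          # objective -Lambda_22

# Weak duality on a coarse grid: no feasible grid point beats the dual value.
grid = [i / 20 for i in range(-40, 41)]
min_on_grid = min(x1 + x2 for x1 in grid for x2 in grid
                  if is_psd_2x2(primal_matrix(x1, x2)))

print(primal_value, dual_value, min_on_grid)
```

Both objective values equal $-1/4$, and the smallest primal value found on the grid matches them.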


Theorem 14.7.3 (Strong Duality Theorem). If the primal SDPP (14.19) is bounded
below and strictly feasible, then the dual SDPP (14.20) is solvable and the optimal values
of the two problems are equal.
* 
TOR 
i 





: Í i ’ 
a ee ee ee 
The proof of this theorem is skipped as it uses the 'separation theorem' of convex
sets, and the knowledge of the 'separation theorem' is very much required to understand
the working of the proof. This topic is not covered in the book. Readers may
refer to Ben-Tal and Nemirovski [16].
The following remark is a consequence of Theorem 14.7.1 and Theorem 14.7.3.
‘he follow i 


the SDPP (14.19) or (14.20) is bounded and 


P stri 
luable and the optimal values of the two prob] ch 


Remark 14.7.1 Jf at least one of 
CMS ar 


feasible then the two SDPPs are so 
equal. 


14.8 Summary and Additional Notes 


• This chapter introduces a new class of optimization problems, namely, semi-definite
programming problems (SDPPs). This class of problems originated from the idea of
defining partial order using the concept of cone. The related background is built in
the initial two sections, Sections 14.2-14.3.

• In Section 14.4, we formulated a general model of SDPP via linear matrix inequality.
It is shown in Section 14.5 that several classes of problems can be cast as SDPPs,
thereby strengthening the significance of the class of SDPPs.

• Sections 14.6-14.7 are devoted to describing the dual formulation and duality results for
SDPP. It is noted that the strong duality result in SDPP does not follow naturally
and requires additional conditions on the feasible sets of the primal-dual pair of
problems.

• In recent years, much effort has been undertaken to obtain the strong duality result
in SDPP. The most prominent among them are the contributions of Ramana et al. and
Pang [169]. Very recently, Jeyakumar [84] presented new necessary and
sufficient conditions for the strong duality in SDPP.

e Unlike LPP, where the interior point soft d | 
for SDPP it al] started only in mid n; tee Sei 
The codes current] i ree. 
figure is not com : aie p P REDE spià 

patible with that of LPP 


opment begins in the mid eighties, 
In the last decade it has grown manifold, 
e thousand variables SDPP. Although this 
interior point software which can handle 
variables yet we find no reason to be disappointed. With the 
endous theoretical growth and with the rapid advancement 
e significant progress in the coming years 
“http://w for SDPP can be obtained by following ™ 
on esearch arti T chemnitz.de /~helmberg /semidef. htm! 3 
1e excellent b iad gable on SDPP but for the beginne" t 


~ and Nemrovski [16] and Wolkow! d 















D 





FAMEN ith todo [159] , Vandenberghe 


a 
“Se | gh [etc E R" and aeR, a0, Define 
3 f 
i K={xeER": cl 
| OX 2 Oh NER" ,; Te 
l : d ; i ; i & t}, 
tly show’ that K ws Q closed conver cone but it is not pointed, whil 
i l j A, While t 
aTe 42 Discuss which of the following sets are ean e Ky 8 not a cone. 
j) K = {(%1,%2, X3) € R? | x1 > x2 > X3 > 0) ea Pointed conves cones 
i ey Ure R" |x, =x) =... ay, 29 f 
wi) = (x,t) E€ R” x R4 | xx < £}. ^ CET SOME KK sa ene A 
; R={cER® : Ci COX +X +... 4 n-1 
ny CnX > 
ia 14.3 Find the dual cone of the following cones 
ai i) K= (122) ER? : x2 2 kal). 
y ii) he 4%, X2) ER.: A = 0} U {(x1,x2) € R? AON 
= fii) K= {(%1, x2, x3) € R | x1 2 x2 2 x3 2 0}. 
S, fi KEA EST: v Axz 0 YXO 
= | Is any of the above cone a self dual cone? 
14.4 Show that $K^*$ is a closed convex cone. Is $K^*$ always pointed? Justify your
reasoning.

14.5 Let $K_1 \subseteq K_2$. Prove that $K_2^* \subseteq K_1^*$.
ot 
2 ; 
d 14.6 Let f(x,y) = A xER, yER+ yF 0. Use the Schur’s complement to write the 
: inequality f(x,y) < t as the linear matrix inequality (LMI). 
14.7 Let $A$ be an $m \times n$ matrix, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, $d \in \mathbb{R}$,
$x \in \mathbb{R}^n$. Express the second order inequality

$$\|Ax + b\| \leq c^T x + d$$

as the linear matrix inequality (LMI).
wey | onlinear programming eee 
148 Write the equivalent SDPP of the f ‘cepa | 
Bele) Min 89 4m tT? 
© b 
, es X41 + %2 x 
Dia < 5, 
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(ii) Min = 


subject to 


fiii) Min Ax, + 3x2 
subject to 


Ko £2 aD 
wie Sl 
Hie 2: 


14.9 Prove that the condition 


—-y-l x 
4 y „ža Ps0 Y y € [0,1] 


is satisfied whenever 


14.10 Write the dual of the following SDPPs 


( 1) Min X1 
subject to 
xı- x 
X2 1 | 2s 0 
xı -2 X? 
X2 2 | 52 0. 
subject to 








1 Xi — X2 0 
‘oes | *1 H X1 X? F 63 0. 
a *2 xı -1 i 


hs eae 


l - 
T á 
a a | ® as 
B Ba e 
e al a 
e ~ a 
= 
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a 


15.1 Introduction 


and searching for the global optimum is a difficult and intractable task. For instance,
the six-hump camel back function given by

$$f(x) = 4x_1^2 - 2.1x_1^4 + \tfrac{1}{3}x_1^6 + x_1 x_2 - 4x_2^2 + 4x_2^4, \quad x_i \in [-5, 5]\ (i = 1, 2),$$

has six local min points out of which only two are global min points $x^* = (0.09, -0.713)^T$,
$x^* = (-0.09, 0.713)^T$ with $f(x^*) = -1.0316$. The Rastrigin function,

$$f(x) = nA + \sum_{i=1}^{n} \left(x_i^2 - A\cos(2\pi x_i)\right), \quad x_i \in [-5.12, 5.12]\ (i = 1, \ldots, n),$$

where $A$ is a positive integer, is a highly multi-modal function having only one global
min point at $x^* = 0 \in \mathbb{R}^n$ with $f(x^*) = 0$.
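The two benchmark functions above are simple to code, which makes them convenient test problems for the methods of this chapter. The following is a minimal sketch: the function names are chosen here for illustration, and $A = 10$ is a common choice for the Rastrigin function rather than a value fixed by the text.

```python
import math


def six_hump_camel(x1, x2):
    """Six-hump camel back function on [-5, 5] x [-5, 5]."""
    return (4 * x1**2 - 2.1 * x1**4 + x1**6 / 3
            + x1 * x2 - 4 * x2**2 + 4 * x2**4)


def rastrigin(x, A=10):
    """Rastrigin function; each x_i is assumed to lie in [-5.12, 5.12]."""
    return len(x) * A + sum(xi**2 - A * math.cos(2 * math.pi * xi) for xi in x)


# Values at the reported global min points.
camel_a = six_hump_camel(0.09, -0.713)
camel_b = six_hump_camel(-0.09, 0.713)
rastrigin_at_origin = rastrigin([0.0, 0.0, 0.0])

print(camel_a, camel_b, rastrigin_at_origin)
```

The two camel back values agree because the function is symmetric under $x \mapsto -x$, and both are close to the reported optimal value $-1.0316$.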
a nt at “4 | 0 ER with si ( j% i timization problem, a losa few 
lem in hand is a con -o for nonconvex Provle ipl 
he gl min point. Otherwise, a a convex feasible 


K 








t F| i 1A 


T ee 
f gts Vr a] q ee 
1y \ IA \ 
12 Prop 


~ D ae ; 
{ m ne wer A y i a 
JL, PIOL ë 


ee 
a) 
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rking principles of various optimization a] Or} 
y objective of this chapter jg to pr ithim 


dge of the wo : 
The primar 7 
hods that are widely used for global] optimi e; 
m, we would like to address the ; 


ave knowl 
wud limitations. 


f a few popular methods t” 
i , ; al 

` n ro be 1n with oul main 3 : i 

alg arial tt Eads very problems that posed ang challenges to algorithna, 
2 ems. j ~ : m ip 
ec es construct algorithms that can efficiently solve the problems. C 
designers 5 


required to h 
their advantages i 
brief description © 


t 


15.2 How Difficult is the Problem? 


A little experience with optimization is enough to convince us that some problems
are harder to solve than others. How difficult can a problem be? Are there problems
for which no efficient algorithm is possible? These are some questions that scientists
have been striving hard to answer. The concept of difficulty is related to the degree of
sophistication of the algorithm. Roughly speaking, if one cannot solve a problem or if it
takes considerable time to solve it, the problem is termed as difficult. We briefly define
below how to mathematically measure the difficulty of the problem.

An algorithmic problem is a mapping that yields a valid solution to the given problem
or instance. An algorithm is a way to compute the mapping; it is a type of effective
method which, given an initial state and a list of well-defined instructions for completing
a task, will proceed through a well-defined series of successive states that eventually
terminate in an end-state. In general, the time taken to do so depends on the individual
example. For all examples of a given size, we can determine the maximum time taken.
This is the worst case complexity, and it is obviously a function of problem size. Different
algorithms may complete the same task with a different set of instructions in less or
more time or effort than others. The worst case complexity of an algorithmic problem
is the time taken by the fastest algorithm that can solve the problem, as a function of
the problem size N. For example, suppose we have three algorithms that can solve a
given problem. The first one takes time that is a linear function of N, the second takes
time quadratic in N, and the third takes time that is exponential in N. The worst case
complexity of the problem is then linear.

Note that the actual time taken to solve an optimization problem depends on the
computer used, the quality of the code, and many such factors. These factors constitute
implementation details. In order to allow a comparison that does not depend on
implementation details, the notion of asymptotic analysis is used. Asymptotic analysis
looks only at how the complexity grows with N, for sufficiently large N. Suppose that an
algorithm A takes time equal to $10N^2$ plus lower-order terms. For sufficiently large N,
this time is bounded by $10.1N^2$. Since the actual constants are implementation dependent,
it is more meaningful to examine how the time taken grows with the problem size. It is
clear that if another algorithm B takes time proportional to $N^3$ for large N, then it would
be slower than A. We say that the worst case complexity for algorithm A is $O(N^2)$, while
that of B is $O(N^3)$. This notation eliminates the need for implementation dependent
constants. It implies that there exists a positive integer $N_0$ and a real number $K > 0$ such
that the worst case run time of algorithm A is at most $KN^2$ for all $N > N_0$.
IN 18 cle 


i} : ; 
arly the smallest or min- 


ss all algorithms available fn. 
ei across = available for the task The 
Ate WOrst case p 
é neasure has sorme 
some 


Tt is possible that the aver 
backs i i i J a average case behavior n I 
s asure is overly pessimisti i lay be mus 
rst case meas tl + í pessimistic. However in mar l 0% he ee iii 
easure that he ; any domains it stil] « 
A valuable m é p us understand how difficult s ains it still serves as 
„omparison to others. Some problems can be. in 










An algorithm is termed as a polynomial time one if its worst case complexity is $O(N^k)$
for some non-negative fixed real number k. Note that k does not depend on N. A polynomial
time algorithmic problem is the one for which a polynomial time algorithm is available
for solving it. The term feasible is also used for polynomial, since for large N, the time
taken by a non-polynomial time algorithm becomes impractically large. The class of all
problems that have feasible or polynomial time algorithms is denoted by 'P'.
On the other hand, an important class of algorithmic problems are decision problems,
where the solution is simply a binary one, that can be treated as a yes/no, or
true/false decision. For instance, consider a graph G = (V, E) defined by V nodes and
their adjacencies. The graph coloring problem involves assigning a label to each node
such that no two adjacent nodes have the same color or label. The optimal coloring
problem is to find an assignment with the minimum number of colors, whereas finding
whether the given graph can have a coloring in 5 colors or less is a decision problem.
Many decision problems require us to use randomized algorithms to solve them. A
non-deterministic algorithm can be viewed as a randomized algorithm, i.e. one which
employs some randomization as part of its logic. The set of all decision problems that
have polynomial time non-deterministic algorithms is called 'NP'. For example, assigning
layers in VLSI, scheduling problems, and a number of tasks in circuit design can be shown
to be equivalent to versions of the optimal coloring problem. All these tasks are in class
NP, which includes a very large number of such problems arising in several domains of
engineering.
se Some of the algorithmic problems share a property oe e aet and are termed 
t reduced to them in polynomial time. These AA cd include optimal coloring of 
ic ® NE- complete’ problems. Examples gt si barge! problem. For many algo- 
-Staphs, the traveling salesman problem, and t 4 d outputs, the threshold problem 
involving complicate Jems. We here skip the 


‘ i si mic OL ti T ization problems, ‘NP-hard’ prob 


>-complete. Such problems are termed ets 
CC 3 i O > 
t the readers can refer to [125] ig it hard to obtal p Dal optimum. 
on Tae ro een lod eS Q O 
at for several Prov. ems, fae e are to the 8! MEIOS 
B aaan how “lose T ms are non-deterministe 
yisco nara tY Se A ara TER 
Non-deterministic algorithms may be viewed as randomized algorithms with a clairvoyant
random number generator. While such algorithms are impractical to implement, this
indicates that randomized algorithms may be the only feasible way to search for good
solutions to such hard problems.

The above discussion on algorithmic complexity indicates that deterministic techniques
used for finding the globally optimal solutions may take a very long time, particularly
for real world problems that involve several variables. Most global optimization methods
that are widely applied therefore involve some randomized steps, or are 'heuristic' in
nature.

Algorithms that can be used to define heuristic methods, and are applicable to a
wide set of different optimization problems, are called 'metaheuristics'. Examples of
metaheuristics include 'simulated annealing', 'genetic algorithms', 'ant colony
optimization', 'particle swarm optimization', and 'tabu search', to name a few. The idea is
to leave the best solution untouched while allowing other states to explore the search
space. These approaches also try to combine features of several solutions in an attempt to
find the best solution. The use of metaheuristics has seen a significant increase in the
last decade because of their ability to find high quality solutions to otherwise hard
optimization problems. In the sequel to follow, we describe some of these metaheuristics.

15.3 Simulated Annealing 


Simulated Annealing (SA) is a randomized search technique in numerical optimization
based on the theory of thermodynamics. The idea of SA comes from a paper published
by Metropolis et al. [115] in 1953. Annealing, in metallurgy, is a technique that
involves heating followed by a controlled cooling of a metal. At high temperatures, the
atoms shake vigorously, and can collectively move to a much higher energy state (see
Fig. 15.1). As the temperature is reduced, the shaking reduces, and moves to higher
energy states become less likely.

The SA algorithm involves generating a new state and accepting it by applying an
acceptance criterion. New states are usually generated by applying a set of 'moves' or
transformation rules to the present state. Generated states may have a cost or energy
that is occasionally higher than that of the present state. If the energy or cost of the
new state is lower, the acceptance criterion accepts the new state, i.e. a transition is
made to it from the present one; if its energy or cost is higher than that of the present
state, it is accepted with some probability. The probability of acceptance depends on the
difference in energies as well as a global parameter T, called temperature.


[Fig. 15.2: plot of f(x) with a worse candidate state labelled "accept it with probability",
a better candidate state labelled "always accept it", and the global min point.]
Referring to Fig. 15.2, suppose the current solution (state) of an optimization problem
is $x^{(k)}$. If the new state is $s_1$, then it is accepted; but if the new state is $s_2$, it is only
accepted with a certain probability. The probability of accepting a worse state is high
initially and decreases as the temperature decreases. The acceptance probability is
based on the chances of obtaining the new state with cost (energy) $f(x^{(k+1)})$ relative to
the previous state with cost (energy) $f(x^{(k)})$, and is described by the Boltzmann
probability distribution function, i.e.

$$p((\Delta f)_k) = \frac{\exp(-f(x^{(k+1)})/bT)}{\exp(-f(x^{(k+1)})/bT) + \exp(-f(x^{(k)})/bT)} = \frac{1}{1 + \exp((\Delta f)_k/bT)} \approx \exp(-(\Delta f)_k/bT),$$

where the last approximation holds when $(\Delta f)_k/bT$ is large. Here $(\Delta f)_k$ represents the
difference between the present and previous values of the costs (energies), i.e.
$(\Delta f)_k = f(x^{(k+1)}) - f(x^{(k)})$, T is the temperature, and b is the Boltzmann's constant
that is used to normalize the energy function values. For practical purposes, b is taken
to be 1.

The search for the minimum is initiated at a random feasible point $x^{(0)}$. Set
$x_{\min} = x^{(0)}$, $f_{\min} = f(x^{(0)})$. We start with an initial temperature $T = T^{(0)}$, which
is set to a high level.
The change in a state from the present state to the candidate new state is accepted if

(i) $(\Delta f)_k \leq 0$, i.e. the function value is decreased. This forces the system towards a
state corresponding to a local or a possibly global min point.

(ii) $(\Delta f)_k > 0$, but $p((\Delta f)_k) = \exp(-(\Delta f)_k/T) > r$, where $r \in (0, 1)$ is a randomly
generated number. Then, an increase in the objective function is accepted with a certain
probability in order to get out of a local min point of f.

It is easy to see that initially, as T is large, the probability of acceptance is high
for any change in cost. In other words, initially all candidate states tend to be
accepted.


The SA procedure can be described as follows.

Initialize $x^{(0)}$ and $T = T^{(0)}$; set $k = 0$.
Set $x_{\min} = x^{(0)}$ and $e_{\min} = e^{(0)}$, where $e$ denotes the objective function (energy).
While $k < k_{\max}$ and $e_{\min} > e_T$ (stopping criteria)                      (15.1)
    $x^{(k+1)} = \text{neighbour}(x^{(k)})$
    If $e^{(k+1)} < e_{\min}$, then
        $x_{\min} = x^{(k+1)}$, $e_{\min} = e^{(k+1)}$
    else
        if $p((\Delta e)_k) > \text{random}(\cdot)$, then
            $x_{\min} = x^{(k+1)}$, $e_{\min} = e^{(k+1)}$
        else
            $x_{\min} = x^{(k)}$, $e_{\min} = e^{(k)}$.
    Decrease the temperature T.
    Set $k = k + 1$, and go to (15.1).
return $x_{\min}$.


Here, $k_{\max}$ is the maximum number of iterations allowed, and $e_T$ is the pre-targeted
lower bound on the objective function f(x). Also, the temperature is usually lowered in
a geometric progression, i.e. $T^{(k+1)} = \beta(T^{(k)})\,T^{(k)}$. Here, $\beta(T^{(k)})$ is a function that
indicates the rate of change, and it is termed as the cooling schedule. The choice of
function $\beta(\cdot)$ is critical for the efficiency of the algorithm, and much effort has been
devoted to optimizing it for some commercial applications.

We encourage the readers to implement the complete SA algorithm and test run it
to find the global optimal solutions to some benchmark problems.
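As a starting point for such an implementation, here is a minimal sketch of the SA loop for a function of one variable. It is an illustration under stated assumptions rather than the definitive procedure: the neighbour move (a uniform perturbation of width `step`), the constant cooling factor `beta`, the default parameter values and the function names are all choices made here. Unlike the listing above, it tracks the current state and the best state found so far separately, and it takes b = 1.

```python
import math
import random


def simulated_annealing(f, x0, lower, upper, t0=10.0, beta=0.95,
                        step=0.5, k_max=5000, seed=0):
    """Minimize f on [lower, upper] by simulated annealing (illustrative sketch)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)            # current state and its energy
    x_min, f_min = x, fx         # best state seen so far
    T = t0
    for _ in range(k_max):
        # Neighbour move: random perturbation, clipped to the feasible interval.
        candidate = min(upper, max(lower, x + rng.uniform(-step, step)))
        f_candidate = f(candidate)
        delta = f_candidate - fx
        # Accept improvements always; accept worse states with probability exp(-delta/T).
        if delta <= 0 or math.exp(-delta / T) > rng.random():
            x, fx = candidate, f_candidate
            if fx < f_min:
                x_min, f_min = x, fx
        # Geometric cooling with a constant factor; the floor avoids division by zero.
        T = max(beta * T, 1e-12)
    return x_min, f_min


def rastrigin_1d(x, A=10):
    """One-dimensional Rastrigin function, used here as a benchmark."""
    return A + x * x - A * math.cos(2 * math.pi * x)


x_best, f_best = simulated_annealing(rastrigin_1d, x0=4.0, lower=-5.12, upper=5.12)
print(x_best, f_best)
```

The quality of the result depends strongly on `t0`, `beta` and `step`, so it is worth re-running with different settings and seeds when testing on benchmark problems.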

One needs to realize that the cooling schedule is critical for the success of the
algorithm. It has been shown that with a logarithmic cooling schedule, SA asymptotically
converges to the optimal solution. Some variants, using other types of distribution
functions, have been proposed to make the convergence faster. For example, 'fast simulated
annealing' uses the Cauchy probability distribution.

SA algorithms have successfully been used in nonlinear optimization, including problems
with several constraints, and they are found to be robust. At the same time, SA has its
limitations. Several parameters are involved in SA techniques, and their tuning can have
a significant impact upon the quality of the final solution. Another crucial factor is
time. In fact, one of the major drawbacks of SA is that it can be prohibitively slow for
high dimensional optimization problems.
It is clear from Fig. 15.2 that the choice of the initial point can have a large impact
on the efficiency of search. A logical extension of this observation is to commence the
search from different initial locations; this is known as restart. Alternatively, one can
begin from several different locations in a multistart approach. Since the different local
optima can probably capture a few different features of the search space, by combining
the features of different local optima it is natural to believe that the search for the
global optimum will become faster. This approach has been successfully used in
'evolutionary optimization methods'. Genetic algorithms are a widely used technique from
this class.


15.4 Genetic Algorithms

Genetic algorithms (GA) find their motivation from Charles Darwin's famous theory
of natural selection. This embodies a widely held notion that all life is related and
has descended from a common ancestor. GAs in particular became popular through the
work of John Holland [77] in the early 70's.
The algorithm begins with a set of solutions called a population. Solutions from one
population are taken to form the new one with the hope that the new population will
be better than its ancestor one. The algorithm thus tries to simulate the evolution of
a species in which a population of individuals, over successive generations, optimizes
a set of traits or characteristics that maximizes the fitness of individuals in a given
environment. This is achieved by assigning a numeric fitness value to each individual
in the population. For each generation, individuals are selected from the population for
'reproduction', 'crossover' and 'mutation', to give birth to new individuals. Selection
to the next generation population is entirely based upon the fittest individuals from
the parent and the newly formed offspring generations. The idea is that offspring of fit
parents would inherit good traits or features of their parents, and some of the offspring
would inherit a better combination of traits making them even fitter. This strategy
allows the optimization method to explore the search space. The fitness of the individuals
is expected to improve over time, and the best individual is chosen as the solution.

GAs use two basic processes:

(i) passing over features from one generation to the next generation;
(ii) survival of the fittest.

We now explain the terminology and the procedure associated with a basic GA.
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are mathematically coded as a string, with each character © 


leioana. 

OM SCLC. Ea h 

si ee ` Sene encodes a trait, for example, th 
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Sek determ ining the fitness of 
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i i Evoluti 
i l onary Methods n 
pace. The set of all feasip] ndG 


| lob | 
Spac © 80lutions ne al Optimiza l 
; | h tion 
jing Schemes. In order to apply GA: AE given pro} 543 
to encode the chromosome whe: 5 to Solve Pro lem, 


a given opti 
š “4 : ` ” á ODtir P P E 
ane information about the a y. In far nizatie 
ntain about the solution which it 4 * the chromosomes, problem, 
. re 


af the lar and wide i 
ost popular and widely use 
geal E is a string : : ‘Sed way of encoding is pinn Problem. 
chromos String ot bits (0 or 1), hy Faerie 
n encoded chrome ae coding, 


some might look like 


OG Be La de 


Suppose the string 00000 represents $x^L$ and the string 11111 represents $x^U$. The following
formula is used to decode other values of $x$ in $(x^L, x^U)$ coded as a substring $s$ of
length $l$:

$$x = x^L + (\text{decoded value of } s)\,\frac{x^U - x^L}{2^l - 1}. \tag{15.2}$$

Suppose $x \in [0, \pi]$, and 5-bits are used for coding. Then $x = 0$ is coded as 00000 and
$x = \pi$ is coded as 11111. Now, if $s = 01011$ then its decoded value is
$1 \cdot 2^0 + 1 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3 + 0 \cdot 2^4 = 11$. The corresponding value of $x$ is
calculated using (15.2), and it is equal to $0 + 11\,\frac{\pi - 0}{31} = 1.1148$.
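The decoding rule can be written as a short function. This is a minimal sketch: the function name `decode` and its argument names are chosen here for illustration, and the rightmost bit is taken as the least significant one, which matches the worked value of 11 for the string 01011.

```python
import math


def decode(bits, x_low, x_up):
    """Map a binary string to a point of [x_low, x_up] by linear scaling."""
    value = int(bits, 2)                      # decoded value of the substring s
    return x_low + value * (x_up - x_low) / (2 ** len(bits) - 1)


x = decode("01011", 0.0, math.pi)
print(x)             # approximately 1.1148
print(math.cos(x))   # approximately 0.4404
```

With 5 bits the interval is sampled at 32 equally spaced points, so the achievable precision is $(x^U - x^L)/31$; using more bits gives a finer grid.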
'Permutation encoding' is another popular coding scheme that has been used in
ordering problems, such as the traveling salesman problem or the task ordering problem.
In permutation encoding, every chromosome is a string of numbers in a sequence. The
encoded chromosome looks like 153264798. This encoding is useful only for ordering
problems.

There are some other useful encoding schemes, like 'value encoding' and 'tree
encoding'. These schemes have been successfully used in some very specific optimization
problems.

Fitness Function. A fitness function $F(x)$ is generally derived from the objective function $f(x)$. For a maximization problem, the fitness function is usually taken as the objective function itself, i.e. $F(x) = f(x)$, whereas for a minimization problem, the fitness function is often taken as $F(x) = \frac{1}{1 + f(x)}$.

For instance, suppose we want to maximize $\cos(x)$, $x \in [0, \pi]$. Suppose 5-bit coding is used, and one of the coded chromosomes is $s = 01011$. The corresponding point is $x = 1.1148$ (described above). The fitness value of $x$ (or $s$) is equal to $F(x) = \cos(1.1148) \approx 0.4404$.
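The two conventions for the fitness function can be written as small helpers. The names `fitness_for_max` and `fitness_for_min` are hypothetical, and the minimization variant assumes $f(x) > -1$ (for example $f \ge 0$), so that $1 + f(x)$ stays positive.

```python
import math

def fitness_for_max(f):
    """Maximization: the fitness is the objective itself, F(x) = f(x)."""
    return f

def fitness_for_min(f):
    """Minimization: F(x) = 1 / (1 + f(x)); assumes f(x) > -1."""
    return lambda x: 1.0 / (1.0 + f(x))

F = fitness_for_max(math.cos)
x = 11 * math.pi / 31            # the point coded by s = 01011 on [0, pi]
print(round(F(x), 4))

G = fitness_for_min(lambda t: t ** 2)   # smaller f gives larger fitness
print(G(0.0), G(1.0))
```

For the minimization helper, the point with the smaller objective value ($t = 0$) receives the larger fitness, which is the property the selection operator relies on.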











Initial Population. The next step in GA involves selecting the initial population of chromosomes. The population size has to be chosen with care. If the population size is too small, then there will not be enough exploration of the search space, while if the population size is too large, the algorithm slows down. Empirical experiments suggest that a population size of 50 to 100 works well for some problems.

After the initial population is randomly generated, the algorithm evolves it through GA operators. There is no dearth of possible operators. However, the commonly employed ones are ‘selection or reproduction’, ‘crossover’, ‘mutation’, and ‘inversion’.

The ‘selection’ operator selects good strings in a population and forms a mating pool. The basic idea is that the above-average strings are picked from the current population and their multiple copies are inserted in the mating pool, each string being picked with a probability proportional to its fitness. If $n$ is the population size, then the probability of selecting the $i$-th string is

$$p_i = \frac{F_i}{\sum_{j=1}^{n} F_j}. \qquad (15.3)$$

The string with a higher fitness value has a higher probability of being copied into the mating pool.


Example 15.4.1 Generate a mating pool for the problem: Max $3x - x^2$ over $[0, 3]$.

Solution Suppose a 5-bit binary coding scheme is used, and the population size is $n = 4$ (this is only for illustration purposes), the population being {(01001), (10100), (00001), (11010)}. The fitness function is taken as the objective function itself, i.e. $F(x) = 3x - x^2$. The population after selection, constituting the mating pool, is depicted in the following table. The column $p_i$ is computed using (15.3).


[Table: string, decoded value (9, 20, 1, 26), $x$, $F(x)$, $p_i$, and the resulting mating pool.]
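The selection probabilities of Example 15.4.1 can be recomputed with a short script. It is a sketch under stated assumptions: the objective is read as $3x - x^2$, the strings are decoded on $[0, 3]$ with rule (15.2), and the roulette-wheel draw uses a fixed random seed chosen here only for repeatability, so the mating pool it prints is one possible outcome and not necessarily the one tabulated in the book.

```python
import random

def decode(bits, x_low, x_high):
    """Rule (15.2): binary string to a real value in [x_low, x_high]."""
    return x_low + int(bits, 2) * (x_high - x_low) / (2 ** len(bits) - 1)

population = ["01001", "10100", "00001", "11010"]
xs = [decode(s, 0.0, 3.0) for s in population]
fitness = [3 * x - x ** 2 for x in xs]          # F(x) = f(x) for maximization

total = sum(fitness)
p = [F / total for F in fitness]                # selection probabilities (15.3)

rng = random.Random(1)                          # seed is an arbitrary choice
mating_pool = rng.choices(population, weights=p, k=len(population))

for s, x, F, pi in zip(population, xs, fitness, p):
    print(s, round(x, 4), round(F, 4), round(pi, 4))
print("mating pool:", mating_pool)
```

Under these assumptions the probabilities come out as roughly 0.34, 0.38, 0.05 and 0.22, so the string 10100 is the most likely to be copied and 00001 the least likely.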





The ‘crossover’ operator is the main operator. It combines a part of the chromosome of one parent with a part of the chromosome of the other parent. A simple way is to split the parents' chromosomes at a cut point and choose the left part of the chromosome of one parent and the right part of the chromosome of the other parent. For example, when the crossover site is selected after the third gene of the two chromosomes, two offspring, Offspring 1 and Offspring 2, are produced as a result of crossover. Offspring 1 results from combining the left part of Chromosome 1 and the right part of Chromosome 2, while Offspring 2 results from combining the left part of Chromosome 2 and the right part of Chromosome 1.

The above is an illustration of a one-point crossover. However, the two-point crossover is also frequently used in GA algorithms. In a two-point crossover with the crossover sites selected, say, after the third and fifth genes, the two chromosomes exchange the segment lying between the two sites.

Crossover may be applied many times to get the offspring from each set of parents. The ratio of the number of offspring produced by crossover in each generation to the population size is termed the crossover rate or crossover probability. If the crossover probability is 100%, then all the offspring in the next generation are obtained by crossover. If the crossover probability is 0%, it indicates that the whole new generation is formed from exact copies of chromosomes of the old population. Therefore, a positive crossover probability is desirable so that the chromosomes in the next generation contain good traits of the parents' chromosomes, with some additional new features. At the same time, it is good to have a crossover probability less than 100%, so that some fit parents survive to the next generation, and good traits are retained in a stable fashion.

After the crossover is performed, we modify the population in the next generation by replacing the parents in the older population by their offspring. The population size is kept constant throughout the algorithm.
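One-point and two-point crossover on bit strings can be sketched as follows. The function names and the parent strings are hypothetical illustrations chosen here; they are not the chromosomes printed in the book.

```python
def one_point_crossover(parent1, parent2, cut):
    """Swap the tails of two equal-length strings after position `cut`."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2

def two_point_crossover(parent1, parent2, cut1, cut2):
    """Swap the middle segments lying between positions `cut1` and `cut2`."""
    child1 = parent1[:cut1] + parent2[cut1:cut2] + parent1[cut2:]
    child2 = parent2[:cut1] + parent1[cut1:cut2] + parent2[cut2:]
    return child1, child2

# Crossover site after the third gene
print(one_point_crossover("110100", "001011", 3))
# Crossover sites after the third and fifth genes
print(two_point_crossover("1100110", "0011001", 3, 5))
```

In both cases every gene of an offspring comes from one of the two parents at the same position, which is why crossover alone cannot create a gene value that is absent from the whole population.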

Next, we apply the ‘mutation’ operation. In nature, mutation refers to random changes in the genetic constitution of cells. It is considered to be responsible for the generation of new species and new traits, and is often the source of the changes that have allowed a species to survive hostile changes in its environment. For example, mutation is responsible for many bacterial species developing resistance to antibiotics.

In genetic algorithms, mutation is simulated by randomly changing some of the string elements. The simplest way of achieving this is by randomly changing genes, i.e. for binary coding, switching a few randomly chosen bits from 1 to 0 or from 0 to 1. For example, if the original offspring is 110110, then the mutated offspring can be 010111, which is obtained after switching the first gene from 1 to 0 and the sixth gene from 0 to 1.

Since crossover only combines chromosome fragments from parents, some genes may disappear from the chromosomes of the population after many generations of crossover. Mutation can bring such genes back, and thus ensures that the search does not get trapped in local minima. For instance, suppose that all the chromosomes have the first gene code 0, while the global min point has the first gene code as 1. No matter how long we apply the crossover operation alone, the algorithm will not be able to produce a chromosome with the first gene code as 1. However, this can be obtained through the use of the mutation operator in a single iteration.

The mutation rate or mutation probability is the ratio of the number of random changes to the total number of genes in a population. Generally, the mutation probability is low, otherwise the algorithm becomes more of a random search.

There is no general theory available to tune the crossover and the mutation probabilities. Empirical studies suggest that the crossover probability should be high, somewhere between 80% and 95%, while the mutation probability should be low, about 0.5% to 1%.

Another operator, namely the ‘inversion’ operator, involves taking a random substring from an offspring and inverting it end-to-end. Inversion is not applicable in all scenarios. It is useful when the chromosomal representation depends only on the set of genes and not on their sequence, i.e. when the position of a gene is not important but only whether it is present or not. Inversion is rarely used in applications and its benefits are somewhat unclear.
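Bit-flip mutation can be sketched as below. The helper names `flip_genes` and `mutate` are assumptions introduced here; `mutate` flips each gene independently with probability `rate`, which is one common way of realizing a mutation probability and is not claimed to be the book's exact scheme.

```python
import random

def flip_genes(bits, positions):
    """Flip the genes at the given 0-based positions of a binary string."""
    genes = list(bits)
    for pos in positions:
        genes[pos] = "1" if genes[pos] == "0" else "0"
    return "".join(genes)

def mutate(bits, rate, rng):
    """Flip each gene independently with probability `rate`."""
    chosen = [i for i in range(len(bits)) if rng.random() < rate]
    return flip_genes(bits, chosen)

# The example from the text: switch the first and the sixth gene of 110110
print(flip_genes("110110", [0, 5]))

rng = random.Random(0)
print(mutate("110110", 0.01, rng))   # with a low rate, usually unchanged
```

A rate of 0 leaves the string unchanged and a rate of 1 flips every gene, which shows why a large mutation probability turns the method into a random search.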

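The GA procedure is summarized as follows.

Step 1. Choose a coding to represent the problem parameters; a selection operator; a crossover operator; a mutation operator; the population size; the crossover probability; the mutation probability; and the maximum number of generations $k_{max}$.

Step 2. Generate an initial population at random, evaluate the fitness of each chromosome, and set $k = 0$.

Step 3. If $k > k_{max}$, or some other termination criterion is satisfied, stop.

Step 4. Apply the selection operator to form the mating pool.

Step 5. Apply crossover and mutation to the mating pool to create the offspring.

Step 6. Determine the next generation, which may consist only of the newly created chromosomes or of a mix of individuals from the old population and the newly created ones. The fitness value of a chromosome is used in making this choice. Set $k = k + 1$ and go to Step 3.

Generally speaking, it is desirable not to fill the entire population with only the fittest individuals but to have some diversity in the population so that exploration of the search space continues.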
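The six steps can be put together in a minimal GA. The sketch below is illustrative only: the function `run_ga`, its default parameter values, the use of roulette-wheel selection with one-point crossover, and the elitist rule that always carries the best string forward are all choices made here, and the fitness is assumed to be non-negative on the search interval.

```python
import random

def decode(bits, lo, hi):
    return lo + int(bits, 2) * (hi - lo) / (2 ** len(bits) - 1)

def run_ga(f, lo, hi, n_bits=5, pop_size=20, pc=0.9, pm=0.01, k_max=50, seed=0):
    """Maximize f on [lo, hi]; pop_size is assumed even and f non-negative."""
    rng = random.Random(seed)
    score = lambda s: f(decode(s, lo, hi))
    # Step 2: random initial population
    pop = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(k_max):                                  # Step 3
        # Step 4: roulette-wheel selection (tiny offset avoids all-zero weights)
        weights = [score(s) + 1e-12 for s in pop]
        pool = rng.choices(pop, weights=weights, k=pop_size)
        # Step 5: one-point crossover on consecutive pairs, then bit-flip mutation
        offspring = []
        for a, b in zip(pool[0::2], pool[1::2]):
            if rng.random() < pc:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring += [a, b]
        offspring = ["".join("10"[int(c)] if rng.random() < pm else c for c in s)
                     for s in offspring]
        # Step 6: next generation = offspring, keeping the best string found so far
        offspring[0] = best
        pop = offspring
        best = max(pop, key=score)
    return decode(best, lo, hi), score(best)

x_best, f_best = run_ga(lambda x: 3 * x - x ** 2, 0.0, 3.0)
print(x_best, f_best)
```

With 5 bits the search can only visit 32 points of $[0, 3]$, so the reported maximum of $3x - x^2$ can approach but not exceed the true value 2.25 attained at $x = 1.5$.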


GAs can effectively solve a wide variety of problems, but they also have limitations. A GA may get trapped in local optima if diversity in the population is lost. The evaluation of the fitness function is an important factor in the computation time. GAs may also fail to adapt to sudden changes in the problem. Further, there are problems where the genetic evolution process becomes slow and may take many generations to converge. In such cases, GAs involve a prohibitive amount of computation.
15.5 Ant Colony Optimization

There are other examples in nature from where one can draw inspiration for efficient search. Insect colonies are good examples. How do insects achieve self-organization? Bees not only manage to find food sources efficiently by employing a coordinated search, but also employ elaborate signalling schemes (a kind of intelligence system) to help other members of the colony relocate the same source. Ants are now known to use chemical signalling to help other ants find food sources. An easy to observe species is the common garden ant. An ant looking for food lays down a trail of a chemical called a pheromone, as it forages. Pheromones are chemicals that trigger a natural behavioral response in another member of the same species. Other ants are more likely to follow trails which have more pheromone. Such communication through the environment helps the ants to achieve self-organization in their behavior. This process of engineered self-organization is called stigmergy, a term derived from the Greek words stigma and ergon, meaning sign and action, respectively. Stigmergy is an indirect form of stimulation between two individuals, in which one of them modifies the environment and the other responds to the modified environment at a later time. Examples include termites or wasps building their nest, or ants following a trail.

Ant colony algorithms are inspired by the process of stigmergy. These algorithms use stigmergy as a means of coordination among artificial ants to solve optimization problems. One of the most popular ant colony algorithms is ant colony optimization (ACO), which is used to solve discrete optimization problems. The ACO algorithm was introduced by Marco Dorigo in the 90's (he was awarded the Marie Curie research excellence award by the European Commission in 2003 for his contributions to metaheuristics). The ACO algorithm and its variants are inspired by the behavior of ants in finding paths from their colony nest to food. For instance, consider the two paths between the nest and the food source shown in Fig. 15.3.

[Figure: a nest and a food source joined by a shorter path and a longer path.]
Fig. 15.3. 


While walking down the paths, ants deposit pheromone. The ants that follow prefer paths that have more pheromone. Ants that reach the food source early, using the shorter path, return to the nest earlier. Therefore pheromone starts to accumulate faster on the shorter path. Eventually, the pheromone trail laying mechanism induces a positive feedback that rapidly increases the ants' bias towards the shorter route. After some time, it is observed that nearly all the ants prefer the shorter path, which also has an increasing amount of pheromone. In this model, ants deposit pheromone both on their forward path from the nest to the food source, and on the return path to the nest. It was experimentally observed that if the ants are assumed to deposit pheromone only one way, either only on their forward trip or only on their return trip, then the ant colony is unable to choose the shortest path.
In a real-life scenario, one can easily relate the ACO algorithm with finding the shortest path between two nodes in an undirected graph. Artificial ants (replicas of real ants for problem solving) are used to trace the shortest path or the minimum cost path in the graph by copying the behavior of real ants. However, such ants quickly fail because of looping. The looping phenomenon is easy to explain. Once an ant enters a loop, it starts reinforcing the pheromone trail rapidly and the loop becomes ever more attractive to follow. One measure to overcome this hurdle is to carry out the pheromone update in the algorithm only in the return mode, and completely remove it from the forward mode, i.e. assuming that the ants do not leave a trail of pheromone on their forward journey but only on their return journey. On its own, however, this measure fails to make the colony converge to the shortest path, a fact that has also been mentioned above. This led to introducing further modifications in the behavior of artificial ants. The ants are given a limited form of memory to store the paths they have followed and the costs of the links traversed. The memory is used by the artificial ants to (i) construct probabilistic forward paths; (ii) retrace, in a deterministic way, the path they had constructed, on the return journey, along with loop elimination; (iii) evaluate the quality of the solution. Another factor that has been incorporated in the ant colony algorithm is the evaporation of the pheromone, which lets the colony gradually forget poor choices made earlier.
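Let $G = (V, E)$ be a graph, with $V$ the set of vertices and $E$ the set of arcs or edges. Whether there exists an arc $(i, j) \in E$ is recorded by the edge-weights $e_{ij}$ defined as follows,

$$e_{ij} = \begin{cases} 1, & \text{if there is an edge from node } i \text{ to node } j, \\ 0, & \text{otherwise.} \end{cases}$$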

The amount of pheromone on the arc between nodes $i$ and $j$ is denoted by $q_{ij}$. We assume that the ants do not deposit any pheromone while moving in the forward path. Let the number of ants be $N$; let the source (nest) node be denoted by $S \in V$ and the destination (food) node be denoted by $D \in V$. Also, we shall be using pred$(k, i)$ to denote the node visited by the ant $k$ just before visiting the node $i$.

The algorithm is designed so that the amount of pheromone on a path is a monotonic function of the goodness of that path. The goodness of the path is estimated by the foraging behavior of the ants.

The general framework of the algorithm may be summarized as follows.
Step 1. (Initialization) Set $q_{ij} = q_0$, $\forall\, i, j$. This ensures that all edges in $G$ have an equal non-zero amount of pheromone on them.

Step 2. (Forward Path Construction / Forward Mode) Suppose the $k$-th ant is currently at the $i$-th node in $G$. The forward path is constructed by probabilistically choosing the next node to move to among those in the neighborhood of the current node $i$ in $G$. The probabilistic choice is biased by the pheromone previously deposited on $G$ by the earlier ants who had traversed those nodes during their return journey. Remember, ants in the forward mode do not deposit any pheromone on $G$. The probabilities are computed as follows.

For each ant $k = 1, \ldots, N$,
For each node $i = 1, \ldots, |V|$,
compute the probability of the ant moving to the next node $j$ as

$$\mathrm{prob}(k, i, j) = \begin{cases} \dfrac{e_{ij}\, q_{ij}}{\sum_{l \ne \mathrm{pred}(k,i)} e_{il}\, q_{il}}, & j \ne \mathrm{pred}(k, i), \\ 0, & \text{otherwise,} \end{cases}$$

which depends on the pheromone level $q_{ij}$.

Move each ant to the next node based on the above probabilities. Note that if the node $j$ is not in the neighborhood of node $i$ in $G$, then $e_{ij} = 0$, and hence $\mathrm{prob}(k, i, j) = 0$. If the graph $G$ is a weighted graph with edge-weights representing the distance $d_{ij}$, then the probabilities are computed as follows,

$$\mathrm{prob}(k, i, j) = \begin{cases} \dfrac{e_{ij}\, q_{ij}\, \eta_{ij}^{\beta}}{\sum_{l \ne \mathrm{pred}(k,i)} e_{il}\, q_{il}\, \eta_{il}^{\beta}}, & j \ne \mathrm{pred}(k, i), \\ 0, & \text{otherwise,} \end{cases}$$

where $\eta_{ij}$ is the heuristic information denoting the desirability of the arc $(i, j)$, and it is typically taken as $\eta_{ij} = \frac{1}{d_{ij}}$; also $\beta$ is a parameter to control the influence of $\eta_{ij}$.

The ant memorizes the nodes it had visited and the costs of the edges (edge lengths $d_{ij}$) traversed. The cost of the solution generated by the ant can thus be evaluated and stored in the memory. This memory is used later during the return journey.
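The forward-mode choice of the next node can be sketched as follows. The data layout is an assumption made here for illustration: `neighbors` is an adjacency list (so $e_{ij} = 1$ exactly for the listed neighbors), pheromone `q` and distances `d` are dictionaries keyed by unordered node pairs, and an ant that has no option other than turning back is allowed to do so, a case the text does not discuss.

```python
import random

def transition_probs(i, pred, neighbors, q, d=None, beta=1.0):
    """Probabilities for an ant at node i, having arrived from pred, to move to j."""
    candidates = [j for j in neighbors[i] if j != pred]
    if not candidates:                 # dead end: let the ant turn back (assumption)
        return {pred: 1.0}

    def weight(j):
        edge = frozenset((i, j))
        eta = 1.0 if d is None else 1.0 / d[edge]   # heuristic desirability 1/d_ij
        return q[edge] * eta ** beta

    total = sum(weight(j) for j in candidates)
    return {j: weight(j) / total for j in candidates}

neighbors = {0: [1, 2, 3]}
q = {frozenset((0, 1)): 1.0, frozenset((0, 2)): 2.0, frozenset((0, 3)): 1.0}
d = {frozenset((0, 1)): 1.0, frozenset((0, 2)): 2.0, frozenset((0, 3)): 1.0}

p_plain = transition_probs(0, 3, neighbors, q)          # pheromone only
p_weighted = transition_probs(0, 3, neighbors, q, d)    # pheromone and distance
next_node = random.Random(0).choices(list(p_plain), weights=list(p_plain.values()))[0]
print(p_plain, p_weighted, next_node)
```

In the unweighted case node 2 is twice as likely as node 1 because it carries twice the pheromone; once distances are taken into account with $\beta = 1$, the longer edge to node 2 cancels that advantage and the two moves become equally likely.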

Step 3. (Return / Backtrace Mode with pheromone update) The return journey follows the deterministic rules described in points (i)-(iii) below.

(i) Once the $r$-th ant reaches node $D$, it switches from the forward mode to the return mode. It retraces the same path that it had built in its forward journey, step by step, till it reaches the starting node $S$. Remember that the ant had already memorized the forward path. An additional feature of the return journey is the elimination of any loop formed during the forward phase.

(ii) During the return journey, the $r$-th ant deposits some amount of pheromone $\Delta q^r$ on each arc $(i, j)$ that it traverses, so that the amount of pheromone on the arc is updated as $q_{ij} = q_{ij} + \Delta q^r$. The cost of the solution (also called the quality of the solution) generated in the forward journey of the $r$-th ant can be used to fix $\Delta q^r$: it is taken as a decreasing function of the path length, and any such function represents an appropriate choice.

(iii) To prevent the colony from converging too early to poor paths between the nodes $S$ and $D$ in $G$, a phenomenon similar to the forgetting of pheromone is added in the algorithm by allowing the pheromone to evaporate over a period of time. This way an element of ‘artificial memory loss’ is introduced, and the algorithm can forget mistakes made in the past. The pheromone trails are evaporated by applying the following rule,

$$q_{ij} \leftarrow (1 - \rho)\, q_{ij}, \quad \forall\, (i, j) \in E,$$

where $\rho \in (0, 1]$ is an evaporation rate constant. A standard procedure in S-ACO is to first apply the evaporation rule, and thereafter let the ants deposit the pheromone on their return trip.

Several variants of S-ACO have been proposed to make the search faster, and to avoid getting trapped in the local minima that have been found in earlier iterations. ACO algorithms have been tested on a large number of problems and are found to be phenomenally successful. The readers are referred to the wealth of literature on ant colony optimization, some of which is mentioned in the Summary of the chapter.

15.6 Summary and Additional Notes

• The search for the global optimum is akin to the search for the truth or the holy grail, and there are as many paths to it as the number of faiths. But, like in real life, every path consists of some smooth portions that can be seen as advantages, and many hurdles that can be visualized as drawbacks. One has to strive hard to achieve success. The success, many a time, is goal dependent, and there are no shortcuts to it. This can be interpreted as follows: the choice of an appropriate technique to obtain the global optimal solution may depend on the specific problem in hand. The focus in this chapter has been to familiarize the readers with some widely applied techniques. After briefly describing the meaning of difficult optimization problems in Section 15.2, we discussed the working of the three popular heuristics in Sections 15.3 to 15.5.

• One of the major advantages of these heuristics is their gradient-free nature. In other words, these algorithms can be applied to a wide class of optimization problems, which involve highly non-smooth functions. Consequently, these algorithms have been applied to successfully solve many hard core optimization problems, and problems relating to multiobjective optimization, routing in networks, molecular structure, selection, neural networks, mobile communications infrastructure, and machine learning, to name a few. There are many other important features of these metaheuristics, such as robustness and efficiency. They generally yield extremely good solutions, usually close to the global optimum.
Bens: hni ingber.co 
velista few links relevant to the tec http:/ [WwW jngber.c 
For SA, refer to Lester Ingbers homepage - in rogram - 
l ki -TA ba n Ta d a Matlab Interfac ’ urce MATLAB p 
ech code in C and ; open-50 


ing Algorithm» * 


i nnealin 
a a é est Ce. C k L1? 
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matlabcentral/ fileexchange/; or 
es /64/pagel.html; and 
ovide open-source SA files, — 


552 Numerical 
http: /eww.mathworks.com/ | 
ed y www.heatonresearch.com /articl 
http:/ /paradiseo.gforge-inria.ft/ also pr 

ntations of GA, 

- http:// geneura.Ugr.es 

et.mit.edu/ga/; 


Í refer to: 
amcor /-jmerelo /EO. html; 


GAlib - http://lance | 
GOAL - http://www.geocities.com / 


GAGS - http:// kal-el.ugr.es/GAGS/; 
Genetic Algorithm and Direct Search 


The best source of ACO implementations is: 
http:// www.aco-metaheuristic.org / aco-code/public-software.html. 


Another interesting site - 
http: //www.codeproject.com /KB/recipes/ GeneticandAntAlgorithms.aspx 


contains programs for solving the traveling salesman problem by ACO | 
MATLAB. oo 
The years after 1990 have witnessed tremendous 
7 growth in metaheurist; 
rithms. Besides the three metaheuristics described in this chapter er 
other metaheuristics are available in literature. Some of the popular ones a r j 
ki swarm optimization’ (PSO), developed by J. Kennedy and R. C ie 
f p / aypar nin ); ‘tabu search’ attributed to F. Glover (http //spot 
“a ented g“ erg ee mimicking the improvisation hee 
nie ://www.hydroteq.com); ‘greedy random 
A (GRASP) by T. A. Feo and M. G. C. emea e, adaptive search proce- 
featine “einai es pea web links, there are several excellent books 
ec isti 
rar EPOE i inc metaheuristics. The reader may like to refer to the 
nd Aarts [99] for SA; M. Mitch : 
Stitzle [51] for ACO; M. Clerc [38] for PS ; M. Mitchell [116] for GA; Dorigo and 
_ and the text by X. Yang [170]. r PSO; Glover and Laguna [69] for tabu search 
e Innumerable variants and hybrids of th 
more applications of metaheuristi ese techniques have been proposed, and ma?! 
ics have been reported. This is a field of active 


= research, with 
e a large communi 
applications, unity of researchers and users, and a wide rang? : 


geneticoptimization/; 


Toolbox in MATLAB. 
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16.1 Introduction 


The term Machine Learning refers to learning from empirical data which could be in the form of images, measurements, observations, patterns, or records. The ultimate goal of any machine learning algorithm is to perform well on the training data and also to ensure a good performance on future, previously unseen data. Therefore, learning in this context means an inductive process where one observes examples that represent incomplete information about some statistical phenomenon. Three basic problems that are often studied in machine learning are classification, clustering and regression. Although there are several approaches available in the machine learning literature to address these basic problems, we concentrate here on the mathematical programming approach only. This approach was initiated by Mangasarian [110] and has now become very popular in the machine learning community as well as in the mathematical programming community. There is a very close connection between machine learning and mathematical programming, because most of the models for classification, regression and clustering result in linear programming, quadratic programming or certain specialized convex programming problems, e.g. semi-definite programming and second order cone programming. An added advantage of the mathematical programming approach is that it gives access to theoretical results (e.g. optimality conditions and duality theory) and efficient algorithms, so as to use them for the specific problem at hand.

Machine learning techniques have been applied in a variety of areas, e.g. bio-informatics, computer vision, financial forecasting, network intrusion detection, spam categorization and text categorization. These techniques can broadly be classified into two classes, namely, supervised learning and unsupervised learning. The problems of classification and regression belong to the supervised learning class, whereas clustering is in the class of unsupervised learning. In supervised learning, every sample is associated with a label. If the labels are discrete, we have a classification problem; otherwise, for real valued labels, the problem becomes a regression one. Based on the labeled training samples, one is particularly interested in predicting the label for newer samples. Hence learning is not only a question of finding the relation between samples and their labels, but also of generalization to unseen samples.

This chapter presents a very brief and introductory account of the mathematical programming approach for solving the binary (i.e. two class) data classification problem, and thereby takes the reader to the arena of support vector machines. The support vector machine algorithm has been developed by Vapnik and is based on statistical learning theory. The SVM algorithms are non-parametric or data driven techniques, which use minimum assumptions on the internal dynamics of the models. Some of the other popular nonparametric and data driven approaches are the multi layer perceptron (MLP) and projection pursuit regression (PPR). The major advantage of the SVM approach over other approaches (e.g. MLP and PPR) is that it results in a convex programming problem. These convex programming problems are generally structured and efficient algorithms are available in the mathematical programming literature for their solution. Further, because of convexity, we always obtain a global optimal solution.

16.2 Binary (Two-Class) Pattern Classification Problems

By a pattern (data) we shall mean an element of $R^n$. Let there be $m$ patterns having class label $+1$ and $k$ patterns having class label $-1$. Let $A$ (respectively $B$) be the collection of all those patterns having class label $+1$ (respectively $-1$). Then we can construct the matrix $A$ (respectively $B$) of order $m \times n$ (respectively $k \times n$) by taking the $i$-th row of $A$ (respectively of $B$) as the $i$-th pattern of class label $+1$ (respectively class label $-1$). We shall use the same notation $A$ (respectively $B$) for the set $A$ (respectively $B$) as well as for the matrix $A$ (respectively $B$), and the context will specify if we are referring to the set or to the matrix.

The problem of binary classification is to determine a criterion for distinguishing between the sets $A$ and $B$. The decision surface which effects the separation could be in the form of a hyperplane or a surface described by a nonlinear function.

Definition 16.2.1 (Linearly Separable Sets). Two sets $A$ and $B$ of $R^n$ are said to be linearly separable if there exists a hyperplane $w^T x = b$, $w \in R^n$, $b \in R$, such that

$$Aw > eb, \quad Bw < eb, \qquad (16.1)$$

where $A$ and $B$ respectively are the matrices corresponding to the patterns of class $+1$ and class $-1$, and $e$ is the vector of ‘ones’ of appropriate dimension. If the sets $A$ and $B$ are not linearly separable, then they are called non-linearly separable.

Lemma 16.2.1. Let $A$ and $B$ be two finite sets in $R^n$. Then $A$ and $B$ are linearly separable if and only if their convex hulls are disjoint.

Proof. As $A$ and $B$ are finite sets in $R^n$, their convex hulls are closed, bounded and convex. The proof then follows by employing the standard strict separation theorem (e.g. Mangasarian [110]).

Example 16.2.1 Let $A = \{(-1, 0), (0, 1), (1, 0)\}$ and $B = \{(0, 0), (1, 1), (0, 2)\}$. Are $A$ and $B$ linearly separable?

Solution As the convex hulls of $A$ and $B$ are not disjoint (the point $(0, 0)$ of $B$ lies on the segment joining the points $(-1, 0)$ and $(1, 0)$ of $A$), the problem is NOT linearly separable.
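The claim in Example 16.2.1 can be checked numerically with barycentric coordinates: a point lies in the triangle spanned by the three points of $A$ exactly when all three of its barycentric coordinates are non-negative. The helper name `barycentric` is an assumption of this sketch, which works only for a non-degenerate triangle in the plane.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of point p with respect to the triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return l1, l2, 1.0 - l1 - l2

A = [(-1, 0), (0, 1), (1, 0)]
B = [(0, 0), (1, 1), (0, 2)]

for point in B:
    coords = barycentric(point, *A)
    inside = all(t >= -1e-12 for t in coords)
    print(point, coords, "in conv(A)" if inside else "outside conv(A)")
```

Only the point $(0, 0)$ of $B$ has non-negative coordinates, namely $\frac12$, $0$ and $\frac12$ with respect to $(-1, 0)$, $(0, 1)$ and $(1, 0)$, which is enough to show that the two convex hulls meet.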
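Though Lemma 16.2.1 is useful in deciding if the given problem is linearly separable, it is not of much practical use, since finding convex hulls is not an easy task. We therefore make use of linear programming to decide if the given problem is linearly separable. We first note that, by a suitable scaling, the set of inequalities (16.1) can be rewritten as

$$Aw \ge eb + e, \quad Bw \le eb - e. \qquad (16.2)$$

To see this, note that $Aw > eb$ means

$$\sum_{j=1}^{n} a_{ij} w_j > b \quad \text{for } i = 1, 2, \ldots, m,$$

i.e.

$$\sum_{j=1}^{n} a_{ij} w_j \ge b + \alpha_i, \quad i = 1, 2, \ldots, m, \quad \alpha_i > 0,$$

i.e.

$$\sum_{j=1}^{n} a_{ij} w_j \ge b + \alpha^*, \quad \alpha^* = \min_i(\alpha_i), \quad \alpha_i > 0. \qquad (16.3)$$

Similarly $Bw < eb$ means

$$\sum_{j=1}^{n} b_{rj} w_j < b \quad \text{for } r = 1, 2, \ldots, k,$$

i.e.

$$\sum_{j=1}^{n} b_{rj} w_j \le b - \beta_r, \quad r = 1, 2, \ldots, k, \quad \beta_r > 0,$$

i.e.

$$\sum_{j=1}^{n} b_{rj} w_j \le b - \beta^*, \quad \beta^* = \min_r(\beta_r), \quad \beta_r > 0. \qquad (16.4)$$

If we now take $\alpha = \min(\alpha^*, \beta^*)$ and divide throughout by $\alpha$, then (16.3) and (16.4) become

$$Aw \ge eb + e, \quad Bw \le eb - e, \qquad (16.5)$$

where $w/\alpha$ and $b/\alpha$ are still denoted by $w$ and $b$ respectively. Since the inequalities in (16.5) are of the ‘$\ge$’ and ‘$\le$’ type, rather than of the strict ‘$>$’ and ‘$<$’ type, it is natural to apply linear programming to the system (16.5).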
E r . . . s . 
rror Minimizing Linear Programming Probl 
em 


For a real ny 
É R” A T a, let a, = max T 
AE EE G3 re rito x, = (Ca ee 
vector x € R” ie. | , h : "Artei Also let | lll denote the Lı norm of the 
oS Ia <= BI. 
© A Ml We now consider the following optimization proble” 










(2,0). Then for x ER" 






= 


p Y 


yro 
l aa 
E 
-_ + 
fh 
a 


hon” 
ô 


— 


hail T E 


(16.6) 
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ps in transformi Arly Separ; 
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p (166) 5 = 
gave" 


Snoldor ' hot. The 
9. Consider the problems yl he below 


pnma 16-2 
(1) Mi 
am (lg) tha + | A(x), Ih 


md 


(16.3) 


(II) Min {e! T 
yY tez: 
xES Z y Z g(x), y > 0,z > h(x) z> 0) 


en, g: 5s R”, n: 
Erg ™ h:S—> RÝ, yeR andzeR’ 
(I) have identical solution sets z ER.. Then b 

; oth problems (I) and 


We take y = j d 
Proof Y a e ha), oe 0). This gives y > g(x), y > 0. Simi 
, Z 2 0. Also || g(x) I= e"y and || Hts) on 

+ = € Z. is 

o 


raking z = (h(x))+ 
proves the Lemma. 
em . 5 . 
ma to the optimization problem (16.6), we get th 
6), e 


Now applying the above L 
gramming problem, called the error minimizing linear 


(16.4 
) following equivalent linear pro 
programming problem, as 


Min ey e'z 
w,b,y,Z m x T 
(16.5) subject to 
Aw-eb+yze 
isver | —Bwt+eb+z2e 
t 
yz zo; 
(16.7) 


þ are unrestricted. 
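For a given pair $(w, b)$, the smallest slacks allowed by the constraints of (16.7) are $y = (e - Aw + eb)_+$ and $z = (e + Bw - eb)_+$, so the objective of (16.6) and (16.7) can be evaluated without an LP solver. The sketch below does only that evaluation; it does not solve the LPP. The function names are hypothetical, and the test data are the AND and XOR patterns treated in the examples of this section, with trial values of $(w, b)$ chosen here.

```python
def plus(values):
    """Componentwise plus function: (v)_+ = max(v, 0)."""
    return [max(v, 0.0) for v in values]

def error_objective(A, B, w, b):
    """Average misclassification error of the hyperplane w^T x = b, as in (16.6)."""
    def dot(row):
        return sum(r * wj for r, wj in zip(row, w))
    y = plus([1.0 - dot(a) + b for a in A])   # smallest y with Aw - eb + y >= e
    z = plus([1.0 + dot(r) - b for r in B])   # smallest z with -Bw + eb + z >= e
    return sum(y) / len(A) + sum(z) / len(B)

# AND patterns: a trial hyperplane with zero error
and_A, and_B = [(1, 1)], [(0, 0), (0, 1), (1, 0)]
print(error_objective(and_A, and_B, (2, 2), 3))

# XOR patterns: two different trial solutions with the same error
xor_A, xor_B = [(1, 0), (0, 1)], [(0, 0), (1, 1)]
print(error_objective(xor_A, xor_B, (0, 0), 0))
print(error_objective(xor_A, xor_B, (2, 2), 1))
```

A value of zero certifies that the trial hyperplane satisfies (16.5); a positive value only bounds the optimal value of (16.7) from above, since a better $(w, b)$ might exist.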


As problem (16.7) is a LPP, it can be solved efficiently by using the simplex algorithm, and therefore can be used to check if the sets $A$ and $B$ are linearly separable, or equivalently if the given pattern/data classification problem is linearly separable.

Some Useful Deductions from the Error Minimizing LPP

Some useful deductions from problem (16.7) are:

(i) The sets $A$ and $B$ are linearly separable if and only if the optimal value of the error minimizing LPP (16.7) is zero. Then the hyperplane $w^T x = b$ separates the two sets and the misclassification error is zero.

(ii) If $A$ and $B$ are non-linearly separable (i.e. not linearly separable), then the optimal value of the error minimizing LPP (16.7) is not zero. In this case the sets $A$ and $B$ cannot be separated by a hyperplane and the separation can be done by a nonlinear separator only. Further, the optimal solution of (16.7) will provide a hyperplane $w^T x = b$ that does not separate the sets $A$ and $B$ exactly but for which the average error of misclassification (i.e. the objective function of problem (16.7)) is minimum.

(iii) The solution $(w = 0, b, y, z)$ is an optimal solution of (16.7) if and only if $\frac{1}{m} e^T A = \frac{1}{k} e^T B$, in which case it is never unique in $w = 0$. Thus there always exists a solution of the error minimizing LPP (16.7) with a nonzero $w$ (Bennett and Mangasarian [17]).










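Example 16.2.2 (AND Problem). Write the error minimizing LPP for the AND problem and check if the given problem is linearly separable.

Solution For the AND logic, we have

$$A = \begin{pmatrix} 1 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Hence the error minimizing LPP for the patterns $A$ and $B$ is

$$\begin{aligned} \min \quad & y_1 + \frac{1}{3}(z_1 + z_2 + z_3) \\ \text{subject to} \quad & w_1 + w_2 - b + y_1 \ge 1, \\ & b + z_1 \ge 1, \\ & -w_2 + b + z_2 \ge 1, \\ & -w_1 + b + z_3 \ge 1, \\ & y_1, z_1, z_2, z_3 \ge 0, \\ & w_1, w_2, b \text{ unrestricted in sign.} \end{aligned}$$

The optimal value of this LPP is zero; for example, $w_1 = w_2 = 2$, $b = 3$, $y_1 = z_1 = z_2 = z_3 = 0$ is feasible with objective value zero. Therefore the AND problem is linearly separable. This also agrees with Lemma 16.2.1, since the convex hulls of $A$ and $B$ are disjoint.

Example 16.2.3 (XOR Problem). Write the error minimizing LPP for the XOR problem and check if the given problem is linearly separable.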
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solution For the XOR logic, we have 
Fie Ve a0 
i Hi 
Oe oO 
satan? 
‘fence the error minimizing LPP is 
1 1 
Min 5 (V1 + Y2) + 5 A + Zp) 
subject to 
| w,-b+1 21 
W2 — b+ Y2 eal 
d . Ps paz Z 1 
E | Í -w4 — W2 + b + 22 An 


Zo. = 
Y1, Y2, Z], 42 estricted in sign. 


Wy, W2,b unr 
of the 
= e) is an optimal "setae arly 
I d (w = 0 b oe roblem i NOY 2 
=a J 
Dove ` = N h a Therefore the al solution of the a 
EP wit th | value as 2. ‘sts OP a A 
with the gs tainly exis hail 

e. Bu tw go know iha gan cer neal =~ ee OMe — 0.5. As the 
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A and B. but it will give _otvomairegen geometrically. 


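Figure 16.3 illustrates these observations geometrically.

[Fig. 16.3: the points P, Q, R and S = (1, 1) in the $(x_1, x_2)$ plane, together with the lines $2x_1 + 2x_2 = 0$ and $2x_1 + 2x_2 = 1$ (i.e. $x_1 + x_2 = 0.5$).]

The line segments PQ and RS are the convex hulls of the sets $A$ and $B$. As these convex hulls are not disjoint, the problem is not linearly separable.

For the hyperplane $2x_1 + 2x_2 = 1$, i.e. the line $x_1 + x_2 = 0.5$, the only misclassified pattern is $x = (1,1)^T$, for which the error is $|w^T x - (b - 1)| = 4$, where $w_1 = w_2 = 2$ and $b = 1$. But as per our construction of the error minimizing LPP, we have taken the average error of misclassification, i.e. $\frac{1}{2}(0 + 0) + \frac{1}{2}(0 + 4) = 2$.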
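The objective values quoted for the AND and XOR problems can be checked with a few lines of Python. This is only a sketch: it assumes that problem (16.7) averages the errors of each class separately (as in the two examples above), and it evaluates the objective at a given $(w, b)$ rather than solving the LPP. The function name `lpp_error` and the AND candidate $w = (2, 2)$, $b = 3$ are illustrative choices, not taken from the text.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def lpp_error(w, b, A, B):
    """Objective of the error minimizing LPP at (w, b), using the
    smallest errors y_i, z_j that satisfy the constraints."""
    # Class A constraint:  w.a - b + y >= 1   =>  y >= 1 - (w.a - b)
    y = [max(0.0, 1.0 - (dot(w, a) - b)) for a in A]
    # Class B constraint: -w.c + b + z >= 1   =>  z >= 1 + (w.c - b)
    z = [max(0.0, 1.0 + (dot(w, c) - b)) for c in B]
    return sum(y) / len(A) + sum(z) / len(B)

AND_A, AND_B = [(1, 1)], [(0, 0), (0, 1), (1, 0)]
XOR_A, XOR_B = [(1, 0), (0, 1)], [(0, 0), (1, 1)]

and_error = lpp_error((2, 2), 3, AND_A, AND_B)            # separable case
xor_error_zero_w = lpp_error((0, 0), 0, XOR_A, XOR_B)     # w = 0
xor_error_nonzero_w = lpp_error((2, 2), 1, XOR_A, XOR_B)  # w != 0
print(and_error, xor_error_zero_w, xor_error_nonzero_w)
```

Both XOR candidates give the same objective value, which illustrates that an optimal solution with $w = 0$ is not unique.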


Let the sets $A, B \subset R^n$ be linearly separable. Then there certainly exists a separating hyperplane $w^T x = b$ which ...

Definition 16.3.1. (Canonical Separating Hyperplane). Let $A, B \subset R^n$ be linearly separable. A separating hyperplane $w^T x = b$ is called a canonical separating hyperplane if ...

Definition 16.3.2. (Dead Zone). Let $A, B \subset R^n$ be linearly separable and $w^T x = b$ be a canonical separating hyperplane. Then there definitely exists a region $\{x : (b-1) < w^T x < (b+1)\} \subset R^n$ surrounding the separating hyperplane $w^T x = b$ which is void of points from the sets $A$ and $B$. This region is called the dead zone for the separating hyperplane $w^T x = b$.

Definition 16.3.3. (Margin). Let $A, B \subset R^n$ be linearly separable and $w^T x = b$ be a canonical separating hyperplane. Let $w^T x = (b-1)$ and $w^T x = (b+1)$ be the hyperplanes which define the dead zone. Then the distance between these two hyperplanes, namely $w^T x = (b-1)$ and $w^T x = (b+1)$, is called the margin for the separating hyperplane $w^T x = b$.





It is easy to compute the margin for a canonical separating hyperplane $w^T x = b$. We note that the distance between a point on the hyperplane $w^T x = (b-1)$ and the hyperplane $w^T x = b$ equals $\frac{1}{\|w\|}$. Therefore for the separating hyperplane $w^T x = b$, the margin is $\frac{2}{\|w\|}$, where $\|w\| = (w_1^2 + \cdots + w_n^2)^{1/2}$. Figure 16.5 depicts the margin for the separating hyperplane $w^T x = b$.

Fig. 16.5.
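As a quick numerical check of the margin formula, the sketch below computes $2/\|w\|$ and compares it with twice the distance from a point on $w^T x = b - 1$ to the hyperplane $w^T x = b$. The values $w = (3, 4)$, $b = 2$ and the helper names are hypothetical, chosen only so that $\|w\| = 5$.

```python
import math

def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))

def margin(w):
    """Margin 2/||w|| of a canonical separating hyperplane w.x = b."""
    return 2.0 / norm(w)

def distance_to_hyperplane(x, w, b):
    """Euclidean distance from the point x to the hyperplane w.x = b."""
    return abs(sum(wi * xi for wi, xi in zip(w, x)) - b) / norm(w)

w, b = (3.0, 4.0), 2.0
# A point on the bounding hyperplane w.x = b - 1: scale w so that w.x = b - 1.
scale = (b - 1.0) / norm(w) ** 2
x_on_boundary = tuple(scale * wi for wi in w)

half_margin = distance_to_hyperplane(x_on_boundary, w, b)
print(margin(w), 2 * half_margin)
```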


For our discussion in the rest of the chapter we shall drop the word 'canonical' and refer to a canonical separating hyperplane as a separating hyperplane only.

The obvious question now is with regard to the definition of optimality in terms of margin. Why should we be interested in that separating hyperplane for which the margin is a maximum? There is a very sound theoretical justification for this choice, which involves some deeper concepts (e.g. V.C. dimension and structural risk minimization) from the statistical theory of machine learning. We refer the interested reader to the literature on statistical learning theory.

Let $\{(x^{(i)}, y_i),\ i = 1, 2, \ldots, p\}$ be the given linearly separable training patterns. Then there exists $w \in R^n$ and $b \in R$ such that $w^T x^{(i)} - b > 0$ for all $i$ having $y_i = 1$ and $w^T x^{(i)} - b < 0$ for all $i$ having $y_i = -1$.

By suitable scaling we can write the above as

$$w^T x^{(i)} - b \ge 1 \quad \text{for all } i \text{ having } y_i = 1$$

and

$$w^T x^{(i)} - b \le -1 \quad \text{for all } i \text{ having } y_i = -1.$$

This can be written in a more compact form as

$$y_i (w^T x^{(i)} - b) \ge 1 \quad \text{for all } i = 1, 2, \ldots, p. \tag{16.8}$$


Note that inequalities (16.8) are the same as the constraints of the error minimizing LPP (16.7), except that the error variables are taken as zero, because the problem is linearly separable and $w^T x = b$ is a separating hyperplane. We have also changed our notation a bit for the sake of convenience. We have not formed the matrices $A$ and $B$ separately, because we have taken all patterns together in constraints (16.8), and the sign of the $i$-th constraint gets associated with the class label $y_i = 1$ or $-1$.

Since the margin is $\frac{2}{\|w\|}$, maximizing the margin subject to (16.8) is equivalent to the problem

$$\text{Min } \tfrac{1}{2} w^T w \quad \text{subject to} \quad y_i (w^T x^{(i)} - b) \ge 1 \quad (i = 1, 2, \ldots, p). \tag{16.9}$$

Problem (16.9) is a standard convex quadratic programming problem (QPP) and can be solved by a suitable QPP algorithm. Once an optimal solution $(\bar{w}, \bar{b})$ of (16.9) has been obtained, the maximum margin classifier $\bar{w}^T x = \bar{b}$ is known. This classifier is known as the hard margin classifier. There is one practical difficulty with QPP (16.9): it has as many constraints as the number of patterns and hence may be extremely difficult to solve for large datasets. In the machine learning literature, special chunking algorithms and decomposition schemes have been developed so that not all constraints are included at one time. In mathematical programming, the standard approach to handle a large number of constraints is to examine if the dual problem can be solved efficiently. We attempt to use the same strategy for (16.9) and write the dual. This requires the Karush-Kuhn-Tucker (KKT) conditions for problem (16.9).





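KKT conditions for QPP (16.9)

The Lagrangian for problem (16.9) is given by

$$L(w, b, \alpha) = \tfrac{1}{2} w^T w + \sum_{i=1}^{p} \alpha_i \left(1 - y_i (w^T x^{(i)} - b)\right), \tag{16.10}$$

where $w \in R^n$, $b \in R$ and $\alpha \in R^p$ is the vector of Lagrange multipliers. Then the KKT conditions are

$$\nabla_w L = w - \sum_{i=1}^{p} \alpha_i y_i x^{(i)} = 0 \tag{16.11}$$

$$\frac{\partial L}{\partial b} = \sum_{i=1}^{p} \alpha_i y_i = 0 \tag{16.12}$$

$$y_i (w^T x^{(i)} - b) \ge 1 \quad (i = 1, 2, \ldots, p) \tag{16.13}$$

$$\alpha_i \left(1 - y_i (w^T x^{(i)} - b)\right) = 0 \quad (i = 1, 2, \ldots, p) \tag{16.14}$$

$$\alpha_i \ge 0 \quad (i = 1, 2, \ldots, p). \tag{16.15}$$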


Let $(\bar{w}, \bar{b}, \bar{\alpha})$ be a solution to the above KKT system. Then, by the KKT theorem (convex case), $(\bar{w}, \bar{b})$ is optimal to the quadratic programming problem (16.9). By (16.14), $\bar{\alpha}_i > 0$ is possible only if $y_i(\bar{w}^T x^{(i)} - \bar{b}) = 1$, i.e. only for patterns lying on the hyperplanes $\bar{w}^T x = \bar{b} \pm 1$ which bound the dead zone. The patterns $x^{(i)}$ with $\bar{\alpha}_i > 0$ are called support vectors.

Let $S$ denote the index set of all support vectors, i.e.

$$S = \{i : x^{(i)} \text{ is a support vector},\ 1 \le i \le p\} = \{i : \bar{\alpha}_i > 0,\ 1 \le i \le p\}. \tag{16.17}$$

Then $\bar{w} = \sum_{i \in S} \bar{\alpha}_i y_i x^{(i)}$ because for $i \notin S$, $\bar{\alpha}_i = 0$. If we now pick a specific support vector, say $x^{(k)}$, then $\bar{\alpha}_k > 0$ implies $y_k(\bar{w}^T x^{(k)} - \bar{b}) = 1$. This gives $\bar{w}^T x^{(k)} - \bar{b} = y_k$ because $y_k^2 = 1$. This implies that $\bar{b} = -y_k + \bar{w}^T x^{(k)}$. Since there could be many support vectors, in practice, we compute the values of $\bar{b}$ corresponding to each support vector, and then take the average of all these values.

For any unseen or test point $x^{(p+1)}$, we compute $\bar{w}^T x^{(p+1)}$. If $\bar{w}^T x^{(p+1)} - \bar{b} > 0$ then $x^{(p+1)}$ is labeled as $+1$, while if it is less than zero then $x^{(p+1)}$ is labeled as $-1$. Thus once the classifier $\bar{w}^T x = \bar{b}$ is known, the whole process of assigning a label to a future pattern works like a machine. Since only the support vectors are used in the determination of the classifier $\bar{w}^T x = \bar{b}$, the learning machine built around this approach is called a support vector machine (SVM).

The SVM algorithm as described above crucially depends on the multipliers $\bar{\alpha}_i$ $(i = 1, 2, \ldots, p)$. Once $\bar{\alpha}$ is known, we know the support vectors and hence $\bar{w}$ and $\bar{b}$. How can we determine these multipliers? Recalling our discussion on nonlinear programming duality (Chapter 8), we note that the KKT multipliers $\alpha_i$ are the dual variables. Therefore we consider the Wolfe dual of problem (16.9).
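Wolfe dual of Problem (16.9)

The dual of problem (16.9) is given by

$$\begin{array}{ll}
\text{Max} & L(w, b, \alpha) \\
\text{subject to} & \nabla_w L(w, b, \alpha) = 0 \\
& \dfrac{\partial L}{\partial b}(w, b, \alpha) = 0 \\
& \alpha \ge 0,
\end{array} \tag{16.18}$$

where $L(w, b, \alpha)$ is defined in (16.10). From (16.11) and (16.12), we have

$$w = \sum_{i=1}^{p} \alpha_i y_i x^{(i)} \tag{16.19}$$

and $\sum_{i=1}^{p} \alpha_i y_i = 0$. Therefore using (16.19) we get

$$\begin{aligned}
L &= \tfrac{1}{2} w^T w + \sum_{i=1}^{p} \alpha_i \left(1 - y_i (w^T x^{(i)} - b)\right) \\
&= \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j x^{(i)T} x^{(j)} - \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j x^{(i)T} x^{(j)} + b \sum_{i=1}^{p} \alpha_i y_i + \sum_{i=1}^{p} \alpha_i \\
&= \sum_{i=1}^{p} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j x^{(i)T} x^{(j)}
\end{aligned} \tag{16.20}$$

because $\sum_{i=1}^{p} \alpha_i y_i = 0$. Hence the dual problem becomes

$$\begin{array}{ll}
\text{Max} & \displaystyle \sum_{i=1}^{p} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j x^{(i)T} x^{(j)} \\
\text{subject to} & \displaystyle \sum_{i=1}^{p} \alpha_i y_i = 0 \\
& \alpha_i \ge 0 \quad (i = 1, 2, \ldots, p).
\end{array} \tag{16.21}$$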
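The dual objective (16.21) is easy to evaluate directly. The sketch below does so with exact fractions for the five training patterns of the numerical example that follows, and compares the result with the primal objective $\frac{1}{2} w^T w$. The multiplier value $2/17$ is an assumption: it is the exact value that appears to lie behind the rounded figure 0.12 reported in that example. The function names are illustrative.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def dual_objective(alpha, X, y):
    """Value of sum(alpha_i) - 1/2 sum_ij alpha_i alpha_j y_i y_j x_i.x_j."""
    p = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(p) for j in range(p))
    return sum(alpha) - F(1, 2) * quad

# Patterns read off the primal constraints of the worked example.
X = [(1, 8), (4, 5), (4, 4), (1, 1), (8, 3)]
y = [1, 1, 1, 1, -1]
alpha = [F(0), F(0), F(2, 17), F(0), F(2, 17)]  # assumed exact multipliers

# Recover w from (16.19) and compare primal and dual objective values.
w = tuple(sum(alpha[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(2))
dual_value = dual_objective(alpha, X, y)
primal_value = F(1, 2) * dot(w, w)
dual_feasible = sum(a * yi for a, yi in zip(alpha, y)) == 0 and min(alpha) >= 0
print(w, dual_value, primal_value, dual_feasible)
```

Equal primal and dual values at a primal-dual feasible pair certify optimality, which is the property the dual approach relies on.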
Once an optimal solution $\bar{\alpha}$ of (16.21) is known, the values of $\bar{w}$ and $\bar{b}$ can be obtained as described above.

Solution. The linear separability can be checked either geometrically (by drawing the convex hulls) or by solving the error minimizing LPP. We leave it to the readers to verify that the sets $A$ and $B$ are linearly separable. We now determine the hard margin classifier (16.9) by solving the primal problem as well as the dual problem. The primal problem is

$$\text{Min } \tfrac{1}{2}(w_1^2 + w_2^2)$$

subject to

$$\begin{array}{rl}
w_1 + 8w_2 - b & \ge 1 \\
4w_1 + 5w_2 - b & \ge 1 \\
4w_1 + 4w_2 - b & \ge 1 \\
w_1 + w_2 - b & \ge 1 \\
-8w_1 - 3w_2 + b & \ge 1 \\
w_1, w_2, b & \text{are unrestricted.}
\end{array}$$

We can solve the above QPP by Wolfe's method to get $\bar{w}_1 = -0.47$, $\bar{w}_2 = 0.12$, $\bar{b} = -2.41$. Therefore the hard margin classifier (i.e. the separating hyperplane with maximum margin) is $-0.47x_1 + 0.12x_2 = -2.41$, i.e. $0.47x_1 - 0.12x_2 = 2.41$.
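One way to see which patterns determine this classifier is to evaluate the constraints $y_i(w^T x^{(i)} - b) \ge 1$ at the solution and pick out those that hold with equality. The sketch below uses the exact values $w = (-8/17, 2/17)$ and $b = -41/17$; these are an assumption, chosen because they round to the reported $-0.47$, $0.12$ and $-2.41$ (the rounded values themselves violate one constraint slightly).

```python
from fractions import Fraction as F

X = [(1, 8), (4, 5), (4, 4), (1, 1), (8, 3)]   # patterns from the constraints
y = [1, 1, 1, 1, -1]                           # last pattern belongs to the other class
w = (F(-8, 17), F(2, 17))                      # assumed exact solution
b = F(-41, 17)

# Left-hand sides y_i (w.x_i - b) of the primal constraints.
lhs = [yi * (w[0] * xi[0] + w[1] * xi[1] - b) for xi, yi in zip(X, y)]

feasible = all(v >= 1 for v in lhs)
# Constraints that are tight identify the patterns on the dead-zone boundary.
active = [i + 1 for i, v in enumerate(lhs) if v == 1]
print(lhs, feasible, active)
```

Under these assumed values only the third and fifth constraints are tight, in agreement with the support vectors found from the dual.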
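The dual formulation (16.21) for the above problem is

$$\begin{array}{ll}
\text{Max} & -\tfrac{65}{2}\alpha_1^2 - \tfrac{41}{2}\alpha_2^2 - 16\alpha_3^2 - \alpha_4^2 - \tfrac{73}{2}\alpha_5^2 - 44\alpha_1\alpha_2 - 36\alpha_1\alpha_3 - 9\alpha_1\alpha_4 + 32\alpha_1\alpha_5 \\
& \; - 36\alpha_2\alpha_3 - 9\alpha_2\alpha_4 + 47\alpha_2\alpha_5 - 8\alpha_3\alpha_4 + 44\alpha_3\alpha_5 + 11\alpha_4\alpha_5 + \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 + \alpha_5 \\
\text{subject to} & \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 - \alpha_5 = 0 \\
& \alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5 \ge 0.
\end{array} \tag{16.22}$$

Solving (16.22), we get $\bar{\alpha}_3 = \bar{\alpha}_5 = 0.12$ with the remaining multipliers equal to zero. Therefore $x^{(3)} = (4, 4)^T$ and $x^{(5)} = (8, 3)^T$ are the support vectors, and

$$\bar{w} = \bar{\alpha}_3 y_3 x^{(3)} + \bar{\alpha}_5 y_5 x^{(5)} = (0.12)(1)(4, 4)^T + (0.12)(-1)(8, 3)^T = (-0.48, 0.12)^T.$$

Furthermore $\bar{b} = 0.5(b_3 + b_5)$ where $b_3$ and $b_5$ are given by

$$b_3 = -y_3 + \bar{w}^T x^{(3)} = -1 + (-0.48, 0.12)(4, 4)^T$$

$$b_5 = -y_5 + \bar{w}^T x^{(5)} = -(-1) + (-0.48, 0.12)(8, 3)^T.$$

This gives $\bar{b} = -2.46$. Therefore the hard margin classifier is $0.48x_1 - 0.12x_2 = 2.46$, which is the same as the one given by the primal problem, except for some difference due to numerical computations.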
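The recovery of $\bar{w}$ and $\bar{b}$ from the multipliers, and the labeling rule for a new pattern, can be sketched as follows. The data are the five patterns of this example with the rounded multipliers $\bar{\alpha}_3 = \bar{\alpha}_5 = 0.12$; the function names are illustrative only.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def recover_classifier(alpha, X, y):
    """Return (w, b): w from the support vectors, b averaged over them."""
    support = [i for i, a in enumerate(alpha) if a > 0]
    n = len(X[0])
    w = tuple(sum(alpha[i] * y[i] * X[i][k] for i in support) for k in range(n))
    # b_k = -y_k + w.x_k for each support vector, then take the average.
    b_values = [-y[k] + dot(w, X[k]) for k in support]
    return w, sum(b_values) / len(b_values)

def classify(x, w, b):
    """Label +1 if w.x - b > 0, otherwise -1."""
    return 1 if dot(w, x) - b > 0 else -1

X = [(1, 8), (4, 5), (4, 4), (1, 1), (8, 3)]
y = [1, 1, 1, 1, -1]
alpha = [0.0, 0.0, 0.12, 0.0, 0.12]

w, b = recover_classifier(alpha, X, y)
predicted = [classify(x, w, b) for x in X]
print(w, b, predicted)
```

With these inputs every training pattern receives its own class label, as expected for a separable problem.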

By solving the error minimizing LPP, we not only check that the given problem is linearly separable but also determine a separating hyperplane. For our present example, the separating hyperplane given by the error minimizing LPP is different from the one we have obtained by solving the hard margin classifier problem. This example again illustrates that the error minimizing LPP need not yield the maximum margin classifier.


16.5 Soft Margin Classifier 


Let $\{(x^{(i)}, y_i),\ i = 1, 2, \ldots, p\}$ be a finite training sample of patterns where $x^{(i)} \in R^n$ and $y_i \in \{-1, 1\}$. We now consider the case when these patterns are not linearly separable. In this situation the error minimizing LPP (16.7) will have a nonzero objective function value, i.e. in the optimal solution of (16.7) not all error variables will be zero. Let these error variables be denoted by $\xi_i$ $(i = 1, 2, \ldots, p)$. (Here again there is a slight change of notation from problem (16.7), in the sense that all error variables are denoted by $\xi_i$ rather than having two different sets of error variables for the two different classes.) Now our aim is to find a classifier $w^T x = b$ for which the total error $\sum_{i=1}^{p} \xi_i$ is least and the margin $\frac{2}{\|w\|}$ is maximum. This is a typical scenario for multiobjective optimization: we have to find a classifier $w^T x = b$ for which both $\sum_{i=1}^{p} \xi_i$ and $\frac{1}{2} w^T w$ are minimum. Since in general these two objectives cannot be minimized simultaneously, we consider the weighted sum $\frac{1}{2} w^T w + C \sum_{i=1}^{p} \xi_i$, which balances the error and the margin, by choosing a parameter $C > 0$. This gives the following optimization problem

$$\begin{array}{ll}
\text{Min} & \displaystyle \tfrac{1}{2} w^T w + C \sum_{i=1}^{p} \xi_i \\
\text{subject to} & y_i (w^T x^{(i)} - b) + \xi_i \ge 1 \quad (i = 1, 2, \ldots, p) \\
& \xi_i \ge 0 \quad (i = 1, 2, \ldots, p).
\end{array} \tag{16.23}$$

Here the constraints are the same as in the error minimizing LPP (16.7), and our preference for the margin/misclassification error will be indicated by our choice of $C$. If we take the parameter $C > 0$ to be large, then this will imply that we are putting greater emphasis on the misclassification error. In this case, our classifier $w^T x = b$ will classify more training samples correctly, but will have a smaller margin as well. There cannot be any specific rule for the choice of $C$; it will depend upon how much tradeoff we wish to have between the two objectives, namely, the margin and the misclassification error.
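The role of $C$ can be illustrated by evaluating the soft margin objective $\frac{1}{2} w^T w + C \sum_i \xi_i$ at two candidate classifiers for the XOR patterns, taking for each candidate the smallest feasible errors $\xi_i = \max(0,\ 1 - y_i(w^T x^{(i)} - b))$. This is only a sketch that compares given candidates, not a QPP solver; the candidates and the values of $C$ are illustrative assumptions.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def soft_margin_objective(w, b, X, y, C):
    """0.5 * w.w + C * sum(xi), with the smallest feasible errors xi."""
    xi = [max(0.0, 1.0 - yi * (dot(w, x) - b)) for x, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(xi)

# XOR patterns with the class assignment used in the error minimizing LPP.
X = [(1, 0), (0, 1), (0, 0), (1, 1)]
y = [1, 1, -1, -1]

flat = ((0.0, 0.0), 0.0)     # w = 0: every pattern has error 1
tilted = ((2.0, 2.0), 1.0)   # w != 0: one pattern has error 4

for C in (0.1, 1.0):
    print(C,
          soft_margin_objective(*flat, X, y, C),
          soft_margin_objective(*tilted, X, y, C))

obj_flat_C1 = soft_margin_objective(*flat, X, y, 1.0)
obj_tilted_C1 = soft_margin_objective(*tilted, X, y, 1.0)
```

Both candidates have the same total error, so for every $C$ the one with the smaller $\frac{1}{2} w^T w$ has the smaller objective.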
The quadratic programming problem (16.23) is referred to as the soft margin classifier because in this case, the margin is no longer 'hard'; it depends on the error variables $\xi_i$. On the lines of the hard margin classifier, we define the Lagrangian as

$$L(w, b, \xi, \alpha, \beta) = \tfrac{1}{2} w^T w + C \sum_{i=1}^{p} \xi_i + \sum_{i=1}^{p} \alpha_i \left[1 - \xi_i - y_i (w^T x^{(i)} - b)\right] - \sum_{i=1}^{p} \beta_i \xi_i, \tag{16.24}$$

and obtain the following KKT conditions

$$\nabla_w L = w - \sum_{i=1}^{p} \alpha_i y_i x^{(i)} = 0 \tag{16.25}$$

$$\frac{\partial L}{\partial b} = \sum_{i=1}^{p} \alpha_i y_i = 0 \tag{16.26}$$

$$\frac{\partial L}{\partial \xi_i} = C - \alpha_i - \beta_i = 0 \quad (i = 1, 2, \ldots, p) \tag{16.27}$$

$$\xi_i + y_i (w^T x^{(i)} - b) \ge 1 \quad (i = 1, 2, \ldots, p) \tag{16.28}$$

$$\alpha_i \left[1 - \xi_i - y_i (w^T x^{(i)} - b)\right] = 0 \quad (i = 1, 2, \ldots, p) \tag{16.29}$$

together with $\beta_i \xi_i = 0$ and $\alpha_i, \beta_i, \xi_i \ge 0$ $(i = 1, 2, \ldots, p)$.

From (16.27) and $\beta_i \ge 0$ we get $0 \le \alpha_i \le C$. A support vector $x^{(i)}$ for which $\alpha_i = C$ is called a bounded support vector.

Proceeding as in the case of the hard margin classifier, the Wolfe dual of problem (16.23) is obtained as

$$\begin{array}{ll}
\text{Max} & \displaystyle \sum_{i=1}^{p} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j x^{(i)T} x^{(j)} \\
\text{subject to} & \displaystyle \sum_{i=1}^{p} \alpha_i y_i = 0 \\
& 0 \le \alpha_i \le C \quad (i = 1, 2, \ldots, p).
\end{array} \tag{16.33}$$





The dual problem (16.33) can be solved by the usual QPP techniques to get $\bar{\alpha}$. Once an optimal $\bar{\alpha}$ is known we get $\bar{w} = \sum_{i=1}^{p} \bar{\alpha}_i y_i x^{(i)}$ and $\bar{b} = -y_j + \bar{w}^T x^{(j)}$ for some $j$ such that $0 < \bar{\alpha}_j < C$.


16.6 Nonlinear Support Vector Machines 


The support vector machines discussed in Sections 16.4 and 16.5 are called linear SVMs because the discriminant function is linear (a hyperplane $w^T x = b$). When the patterns are not linearly separable, then depending upon the value of $C$, a soft margin classifier $w^T x = b$ is obtained, for which the error on the training set is the least. However, in many applications this 'least' error may be unacceptably high. In such a case we need to find a suitable nonlinear discriminant function or nonlinear decision surface. This is achieved by the kernel SVM approach which employs the so called kernelization or kernel trick.

To illustrate the kernel approach, let us consider the classical XOR problem. Here the patterns are $\{(0,0), (1,1), (0,1), (1,0)\}$ with $\{(0,0), (1,1)\}$ having class label $+1$ and $\{(0,1), (1,0)\}$ having class label $-1$. We have seen that this problem is not linearly separable. Can we find a mapping $\phi$ from the input space $R^2$ (the space in which the training samples lie) to a suitable feature space (a higher dimensional space $R^m$) such that the transformed patterns $\{\phi(0,0), \phi(1,1), \phi(0,1), \phi(1,0)\}$ are linearly separable? If the answer to this question is 'yes' then we can apply the linear SVM approach of Sections 16.4 and 16.5 to the transformed patterns in the feature space. Let us consider $\phi : R^2 \to R^3$ given by $\phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. This gives $\phi(0,0) = (0,0,0)$, $\phi(1,1) = (1, \sqrt{2}, 1)$, $\phi(1,0) = (1,0,0)$, and $\phi(0,1) = (0,0,1)$. It can easily be verified that these transformed patterns in the feature space $R^3$ are linearly separable.

Fig. 16.7.
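The effect of such a mapping on the XOR patterns can be checked numerically. The sketch below assumes the quadratic map $\phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, which is consistent with the images $\phi(1,0) = (1,0,0)$ and $\phi(0,1) = (0,0,1)$ quoted in the text, and tests one hand-picked hyperplane in $R^3$. That hyperplane is only an illustrative separator, not the maximum margin one.

```python
import math

def phi(x):
    """Assumed feature map R^2 -> R^3."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

patterns = [(0, 0), (1, 1), (0, 1), (1, 0)]
labels = [1, 1, -1, -1]

# Hand-picked hyperplane w.z = b in the feature space (illustrative only).
w, b = (-1.0, math.sqrt(2.0), -1.0), -0.5
scores = [dot(w, phi(x)) - b for x in patterns]
separated = all(s * y > 0 for s, y in zip(scores, labels))

# For this map the inner product in feature space equals (x.y)^2.
kernel_matches = all(
    math.isclose(dot(phi(x), phi(z)), dot(x, z) ** 2)
    for x in patterns for z in patterns
)
print(scores, separated, kernel_matches)
```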


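Let us suppose that such a $\phi$ exists (we shall justify this later). We can then obtain the optimal hyperplane for the transformed patterns $\{(z^{(i)}, y_i),\ z^{(i)} = \phi(x^{(i)}),\ i = 1, 2, \ldots, p\}$ by solving problem (16.33) with $x^{(i)}$ replaced by $z^{(i)}$, i.e.

$$\begin{array}{ll}
\text{Max} & \displaystyle \sum_{i=1}^{p} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j z^{(i)T} z^{(j)} \\
\text{subject to} & \displaystyle \sum_{i=1}^{p} \alpha_i y_i = 0 \\
& 0 \le \alpha_i \le C \quad (i = 1, 2, \ldots, p).
\end{array} \tag{16.34}$$

Here $z^{(i)T} z^{(j)}$ is the inner product $\langle \phi(x^{(i)}), \phi(x^{(j)}) \rangle$. This inner product is a scalar and is termed as a kernel function $K(x^{(i)}, x^{(j)})$ that maps the pair $(x^{(i)}, x^{(j)})$ to a real number. Figure 16.7 depicts the feature space and the corresponding linear classifier.

Since the transformed patterns appear only as an inner product in the objective function, by employing the kernel function, we can solve (16.34) without explicitly computing $z^{(i)}$, i.e. without the explicit knowledge of $\phi$. This novelty of the kernel approach has been one of the main reasons for the popularity and success of SVMs. In terms of the kernel function, problem (16.34) becomes

$$\begin{array}{ll}
\text{Max} & \displaystyle \sum_{i=1}^{p} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{p} \sum_{j=1}^{p} \alpha_i \alpha_j y_i y_j K(x^{(i)}, x^{(j)}) \\
\text{subject to} & \displaystyle \sum_{i=1}^{p} \alpha_i y_i = 0 \\
& 0 \le \alpha_i \le C \quad (i = 1, 2, \ldots, p).
\end{array} \tag{16.35}$$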


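Let $\bar{\alpha} = (\bar{\alpha}_1, \ldots, \bar{\alpha}_p)$ be optimal to (16.35) and $S = \{i : \bar{\alpha}_i > 0\}$. Then $\bar{w} = \sum_{i \in S} \bar{\alpha}_i y_i \phi(x^{(i)})$. Now for a new pattern $x \in R^n$, we find $f(x) = \phi(x)^T \bar{w} - \bar{b}$ where $\bar{b} = -y_j + \bar{w}^T \phi(x^{(j)})$ for some $j$ such that $0 < \bar{\alpha}_j < C$. Therefore, substituting $\bar{w} = \sum_{i \in S} \bar{\alpha}_i y_i \phi(x^{(i)})$ and using the definition of the kernel $K$, we have

$$f(x) = \sum_{i \in S} \bar{\alpha}_i y_i K(x^{(i)}, x) - \bar{b},$$

where $S = \{i : \bar{\alpha}_i > 0\}$ and $\bar{b}$ is given by

$$\bar{b} = -y_j + \sum_{i \in S} \bar{\alpha}_i y_i K(x^{(i)}, x^{(j)}),$$

for some $j$ such that $0 < \bar{\alpha}_j < C$. In practice $\bar{b}$ may be chosen as the average of all such values. Having found $f(x)$, we assign the label $+1$ to $x$ if $f(x) > 0$ and $-1$ otherwise. It is important to note here that $f(x)$ does not require the function $\phi$ explicitly.

For the above methodology to work, we need to write $K(x, y) = \langle \phi(x), \phi(y) \rangle$ for some $\phi : R^n \to H$, where $H$ is a finite or infinite dimensional space equipped with the inner product $\langle \cdot, \cdot \rangle$ (mostly we take $H = R^m$).

Theorem (Mercer's Theorem). Given a function $K : R^n \times R^n \to R$, there exists a mapping $\phi$ and an inner product space $H$ such that $K(x, y) = \langle \phi(x), \phi(y) \rangle$ if and only if for every real valued function $g : R^n \to R$ with $\int g(x)^2\, dx < \infty$ we have

$$\int\!\!\int K(x, y)\, g(x)\, g(y)\, dx\, dy \ge 0.$$

The linear kernel gives the linear SVM studied in the earlier sections, as there $K(x^{(1)}, x^{(2)}) = (x^{(1)})^T x^{(2)}$.

We now make another interesting observation. Given a kernel function $K$, neither the choice of $H$ nor the choice of $\phi$ is unique. For example, consider $K(x, y) = (x^T y)^2$ for $x, y \in R^2$. We can choose $\phi : R^2 \to R^3$ given by $\phi(a, b) = (a^2, \sqrt{2}\,ab, b^2)$. Then $K(x, y) = \langle \phi(x), \phi(y) \rangle$ for all $x, y \in R^2$. Another choice of $\phi$ and $H$ is $\phi_1 : R^2 \to R^4$ given by $\phi_1(a, b) = (a^2, ab, ab, b^2)$, for which again $K(x, y) = \langle \phi_1(x), \phi_1(y) \rangle$.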
Example 16.6.2 Let $A = \{(-1,0), (1,0)\}$ and $B = \{(0,0)\}$ be two sets in $R^2$ such that the class label for $A$ is $+1$ and for $B$ it is $-1$. Show that $A$ and $B$ are not linearly separable. Use a suitable mapping $\phi$ to determine the classifier.

Solution. The convex hulls of $A$ and $B$ are not disjoint and hence the problem is not linearly separable. Let us now take a mapping $\phi : R^2 \to R^3$ given by $\phi(a, b) = (a^2, \sqrt{2}\,a, 1)$. Then $\phi(-1, 0) = (1, -\sqrt{2}, 1)$, $\phi(0, 0) = (0, 0, 1)$ and $\phi(1, 0) = (1, \sqrt{2}, 1)$, and these transformed patterns are linearly separable in $R^3$. Therefore we can formulate the problem in $R^3$ to get the hard margin classifier $\phi_3(a, b) = 2\phi_1(a, b)$, where $(a, b) \in R^2$ and $\phi(a, b) = (\phi_1(a, b), \phi_2(a, b), \phi_3(a, b))$. This means $1 = 2a^2$, which translates to the nonlinear decision function $2x_1^2 = 1$, i.e. the pair of straight lines $x_1 = \pm\frac{1}{\sqrt{2}}$ in the input space. We can also visualize this solution geometrically as given in Fig. 16.8.
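The kernel form of the decision function can be tried out on this example without ever forming $\phi$. The sketch below uses the kernel $K(x, z) = (x_1 z_1 + 1)^2$, which is the inner product induced by $\phi(a, b) = (a^2, \sqrt{2}\,a, 1)$. The multipliers $\alpha = (1, 1, 2)$ are an assumption: they were worked out by hand for this illustration from $\bar{w} = \sum_i \alpha_i y_i \phi(x^{(i)})$ and $\sum_i \alpha_i y_i = 0$, and are not quoted from the text. Under them the decision function works out to $f(x) = 2x_1^2 - 1$.

```python
import math

def kernel(x, z):
    """K(x, z) = <phi(x), phi(z)> for phi(a, b) = (a^2, sqrt(2) a, 1)."""
    return (x[0] * z[0] + 1.0) ** 2

def bias(alpha, X, y):
    """Average of b_j = -y_j + sum_i alpha_i y_i K(x_i, x_j) over support vectors."""
    support = [j for j, a in enumerate(alpha) if a > 0]
    values = [-y[j] + sum(alpha[i] * y[i] * kernel(X[i], X[j]) for i in support)
              for j in support]
    return sum(values) / len(values)

def decision(x, alpha, X, y, b):
    """f(x) = sum_i alpha_i y_i K(x_i, x) - b."""
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X)) - b

X = [(-1.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
y = [1, 1, -1]
alpha = [1.0, 1.0, 2.0]   # assumed multipliers (hand-derived for this sketch)

b = bias(alpha, X, y)
f_values = [decision(x, alpha, X, y, b) for x in X]
f_on_boundary = decision((1.0 / math.sqrt(2.0), 0.0), alpha, X, y, b)
print(b, f_values, f_on_boundary)
```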
Remark 16.6.1 The optimization problem (16.35) has the formulation of a soft margin classifier even though it is expected that in the feature space the problem will be linearly separable. This is because even after using the kernel $K$ we cannot be certain that linear separability has been fully achieved. Possibly the degree of separability is greater in the feature space than in the input space. In general, even if the problem is linearly separable or approximately so, it is better to employ a soft margin classifier to avoid overfitting.

16.7 Summary and Additional Notes 


• The contents of this chapter constitute a very brief introduction to SVMs for binary data classification problems.

• Sections 16.2 to 16.5 are devoted to linear SVMs while nonlinear SVMs are discussed in Section 16.6. Various issues related with kernel SVMs are discussed and the importance of Mercer's theorem is emphasized.

• Two basic theoretical tools of mathematical programming, namely KKT conditions and duality, play a major role in SVMs and this has been demonstrated time and again.

• Small numerical examples in $R^2$ are discussed to clarify important concepts about SVMs.

• Though our discussion has been restricted to binary data classification only, SVMs have been applied to the multi class scenario as well.

• Support vector regression (SVR) is an SVM based approach to the problem of regression or function approximation. However, this topic has not been included in this chapter.

• Some useful tutorials on SVMs (both for classification and regression) are those by Burges [29], Gunn [71] and Sastry [139].

• The theoretical foundations of SVM were established by V.N. Vapnik.

• Some excellent books on SVMs are those by Cristianini and Taylor [42] and Taylor and Cristianini. These references give a very detailed account of kernel SVMs, from a theoretical as well as an applications standpoint.

• In a practical situation, the number of patterns may be very large but usually the number of support vectors is rather small. Therefore a large number of KKT multipliers are usually zero. This observation has been exploited for the development of some efficient algorithms for SVMs, such as those by Keerthi et al. [93], Joachims [86] and Mangasarian.

• Several variants of SVM have been introduced in the literature, e.g. proximal support vector machine (Fung and Mangasarian [60]), least squares support vector machine, generalized eigenvalue proximal support vector machine, knowledge based support vector machine (Fung et al. [62], Khemchandani et al. [95]) and twin support vector machine. Some of these developments have been discussed in Khemchandani [94].

• Some structured convex programming problems, e.g. semidefinite programming (SDP) and second order cone programming (SOCP), have been used in the literature for the problem of optimal kernel selection. The relevant references are Lanckriet et al. [100], Fung et al. [61] and Khemchandani et al.

• Some standard software packages are LIBSVM [96] and SVMLIGHT [87] (see also [41]). Benchmark data sets to demonstrate the efficiency of any new algorithmic development on SVM are available, among other sites, at http://www.ics.uci.edu/~mlearn/MLRepository.html
16.8 Exercises

— 0,0 , 2 7 \V; = — 
Sees 16.1 Let A = \(0, 0), (2, 0) (0,2)} and B (0, DI 0), (0,1)} be two sets in R2. 
1. Sketch the convex hulls of the sets $A$ and $B$ and hence conclude that these are not linearly separable.
2. Write the error minimizing LPP and solve the same by the simplex method. Identify the separating hyperplane and the associated misclassification error. Can you guess the answer geometrically?
3. Explicitly write the primal and dual QPPs to determine the soft margin classifier.

16.2 Let $A = \{(-1,0), (0,0)\}$ and $B = \{(1,0)\}$ be datasets with class labels $+1$ and $-1$, respectively.

1. Use the error minimizing LPP to show that the problem is linearly separable. Hence determine a separating hyperplane.
2. Is the separating hyperplane obtained by solving the error minimizing LPP optimal? Verify your answer by finding the hard margin classifier.
a e li 
Ss, | te 3 = 4)} be two sets in R2. It is claimed tha 
Cet A = {(2,0),(0,0)} and B = {(4,4)} be “all 
o «S34 Eo Verify this clum analyticatty. 
P ts the hard margin classifier. Veriyy h and [e+ y)+ll and 

es 6 me lelh; I)a 2 
rd = Paii Let x = (0,2, —3)! and (-2, 1,4)". De 9 is a Mercer kernel. 

— | Eth. Show that the polynomial kerne nyo datasets with class 


alg 
>m | i i? 
Mathematical Programming Applications in Financial Mathematics


17.1 Introduction


Modern finance is a flavor of recent times. With exceptional contributions coming in from economists and mathematicians like Kiyoshi Ito in the 40's and 50's, Harry M. Markowitz in the 50's, Fischer Black, Myron Scholes and Robert C. Merton in the late 60's and early 70's, and many others, the subject of financial mathematics has witnessed an explosive growth. The subject focusses on applying mathematical or numerical techniques to the problems arising in finance. The necessity to understand the theoretical and computational aspects of the subject for the success of any organization makes it more appealing. Nowadays the subject commands a high level of attention among researchers and students across the globe.

The aim of the chapter is to take a closer look at a particular problem of finance called portfolio selection from the optimization viewpoint. The theory of optimal selection of portfolio was developed by H. M. Markowitz, a US economist, in the 50's. He shared the Nobel prize in Economics with M. H. Miller and W. F. Sharpe in 1990 for his pioneering work in portfolio theory. Here, we present a brief description of the models and relate them to quadratic programming problems (QPP) through mean-variance analysis.


17.2 Technical Terminologies

Let us first get familiar with some terminologies often used in the chapter.

Definition 17.2.1 (Asset). An asset is an economic entity from which the future economic benefits are expected to flow to the owner of the asset. It is controlled by the owner who legally acquired it as a result of past transactions or other events.

Assets can be classified in many categories. Here we list only a few of them. Tangible assets, such as real estate and machinery, are also known as capital assets in accounting. Intangible assets are defined as the non-monetary assets that cannot be physically measured, like patents, goodwill, knowledge etc. Liquid or financial assets include cash, bonds, shares, mutual funds, currency etc.; gold and other precious metals are assets that are both tangible and liquid.

Throughout the chapter, an asset always means the financial asset only.

Definition 17.2.2 (Return). The return on an asset is an indicator of the gain/loss on the investment of an asset in the financial market. It is determined by the following formula

$$\text{return} = \frac{\text{amount received} - \text{amount invested}}{\text{amount invested}}.$$

Suppose that the current price of an asset is $A(0)$ and after a time period $T$ the asset is sold off at an amount equal to $A(T)$. Then the return on the asset for the time period $T$ is given by

$$\frac{A(T) - A(0)}{A(0)}.$$

A positive value of return on an asset signifies gain while a negative return signifies loss; zero return means neither gain nor loss from the investment.


Remark 17.2.1 It is important to mention here that definition 17.2.2 is given in per- 
centage and hence return on an asset is actually to be understood as a rate of return 
on an asset. However, we shall continue to call it as return on an asset for consistency 
with the financial market glossary. 


Definition 17.2.3 (Risk). The risk is often defined as the degree of uncertainty of 
return on an asset. It signifies the possibility of loss in the investment. 


The risk can either be zero, implying that the asset is risk-free, or positive, implying that the asset is risky. If the asset is risk-free then the future value of the asset is known with certainty; otherwise the future value of the asset is uncertain. The financial assets can thus be classified as risk-free assets, like bonds or fixed deposits, and risky assets.

There are two kinds of risks associated with a risky asset, viz., systematic risk and unsystematic risk. Systematic risk affects the market as a whole, whereas unsystematic risk is specific to an individual asset or company and may arise, for example, from the outcome of unfavorable litigation, a strike in a company, a natural catastrophe etc. Unsystematic risk can be eliminated through diversification, i.e. by investing the amount in more than one asset.

Definition 17.2.4 (Short Selling). It refers to a situation when an investor does not own an asset but establishes a market position by selling the asset in anticipation that the price of that asset will fall. We say that the investor has taken a short position on the asset.

Mathematically, this situation is explained by taking the number of units of the asset owned by the investor as negative.

(i) A short position in a risk-free asset simply means borrowing cash at some interest rate or borrowing rate, and repaying that loan along with the interest at a later stage to close the short position.

(ii) In case of a risky asset, a short position is realized by short selling. Here, an investor borrows an asset and sells it at some price, say $S_0$. The short position is closed when the investor buys back the asset at, say, $S_1$, and returns it to the owner. The difference $S_0 - S_1$ is the gain/loss of the investor.


Remark 17.2.2 Short selling is generally considered to be very risky. If the price of an asset in the short position of an investor shoots up then the investor can lose enormously. Many perceive short selling to be a cause of market falls, and short selling is prohibited in certain markets, though of course it is not completely forbidden. However, the very notion that a short seller can cause a permanent fall in the share prices is itself debatable, for any security which is short sold is to be bought back and hence there is no permanent supply of the shares in the market by the short sellers. Many economists strongly believe that banning short selling does more harm than good to the market, as in that situation the market prices are controlled by the alleged manipulators and irrational investors. Till some time ago, in India, short selling was only available to retail investors (individuals who purchase small amounts of securities for themselves, also called small investors) and not allowed to the institutional investors (entities with large amounts to invest, such as investment companies, mutual funds, brokerages, insurance companies, pension funds and investment banks). Very recently guidelines have been issued by the Securities and Exchange Board of India (SEBI) that enable institutional investors to sell stocks without owning them under certain rules.
Remark 17.2.3 If the number of units of an asset owned by the investor is positive, then the investor is said to have taken a long position on that asset.

To build a mathematical model of the financial scenario we make some simplifying assumptions about the real world market, which include the following.

3. An investor can own a fraction of an asset. This assumption is known as divisibility.

4. An asset can be bought or sold on demand in any quantity at the market price. This assumption is known as liquidity.

5. There are no commissions/transaction costs.

We are now in a position to move towards our main objective of portfolio optimization. We first define a portfolio and then consider two-asset and multi-asset portfolio theories in the subsequent sections.


Definition 17.2.5 (Portfolio). A portfolio is a collection of two or more assets, say $a_1, \dots, a_n$, represented by an ordered $n$-tuple $\Theta = (x_1, \dots, x_n)$, where $x_i \in \mathbb{R}$ is the number of units of the asset $a_i$ $(i = 1, \dots, n)$ owned by the investor.


We consider only a single period model, that is, in between the initial time taken as $t = 0$ and the final transaction time taken as $t = T$, no transaction ever takes place.

Let $V_i(0)$ and $V_i(T)$ be the values of the $i$-th asset at $t = 0$ and $t = T$, respectively. Let $V(0)$ and $V(T)$ denote the values of the portfolio $\Theta = (x_1, \dots, x_n)$ at $t = 0$ and $t = T$, respectively. Then, we have
\[ V(0) = \sum_{i=1}^{n} x_i V_i(0), \]
\[ V(T) = \sum_{i=1}^{n} x_i V_i(T). \]


Definition 17.2.6 (Asset Weights). The weight $w_i$ of the asset $a_i$ is the percentage of the value of the asset in the portfolio at $t = 0$, i.e.
\[ w_i = \frac{x_i V_i(0)}{\sum_{j=1}^{n} x_j V_j(0)}. \]
It can be observed that $w_1 + \dots + w_n = 1$.
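The weights of Definition 17.2.6 can be computed mechanically. The following is a minimal Python sketch; the function name `asset_weights` and the holdings used (20 units at Rs 25 and 15 units at Rs 50) are illustrative assumptions and not part of the definition.

```python
def asset_weights(units, initial_prices):
    """Return (V(0), [w_1, ..., w_n]) for holdings x_i at initial prices V_i(0)."""
    values = [x * v for x, v in zip(units, initial_prices)]
    total = sum(values)  # V(0) = sum of x_i * V_i(0)
    if total == 0:
        raise ValueError("portfolio has zero initial value; weights are undefined")
    return total, [value / total for value in values]


# Assumed illustrative holdings: 20 units at Rs 25 and 15 units at Rs 50.
v0, weights = asset_weights([20, 15], [25.0, 50.0])

# A negative number of units (a short position) produces a negative weight.
v0_short, weights_short = asset_weights([30, -5], [25.0, 50.0])
```

In the second call the weight of the second asset is negative, which is how a short position shows up in the weight vector, while the weights still add up to 1.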


Remark 17.2.4 In a portfolio, if $w_i < 0$ for some $i$, it indicates that the investor has taken a short position on the $i$-th asset $a_i$.

Let $r_i$ be the return (Definition 17.2.2) on the $i$-th asset. Then
\[ r_i = \frac{V_i(T) - V_i(0)}{V_i(0)}. \]

Suppose we purchase $x_1 = 20$ and $x_2 = 15$ units of two assets $a_1$ and $a_2$ with initial prices $V_1(0) =$ Rs 25 and $V_2(0) =$ Rs 50, so that the initial worth of the portfolio is $V(0) = x_1V_1(0) + x_2V_2(0) =$ Rs 1250. The distribution of Rs 1250 between the two assets is $w_1 = (25)(20)/1250 = 40\%$ and $w_2 = (50)(15)/1250 = 60\%$. After one year, suppose the asset values are $V_1(T) =$ Rs 30 and $V_2(T) =$ Rs 45. Then the portfolio worth would be $V(T) = (20)(30) + (15)(45) =$ Rs 1275. The returns on the two assets are $r_1 = 5/25 = 20\%$ and $r_2 = -5/50 = -10\%$, and the return on the portfolio is $25/1250 = 2\%$.

By our assumption, the return value of an asset is a random variable, and thus we can talk about the expected value $\mu_i$ of the return $r_i$, i.e.
\[ \mu_i = E(r_i) \quad (i = 1, \dots, n). \]
The risk associated with the asset is given by the variance $\sigma_i^2$ of the return, i.e.
\[ \sigma_i^2 = \mathrm{var}(r_i) \quad (i = 1, \dots, n). \]
Example 17.2.1 Suppose there are two investment opportunities $O_1$ and $O_2$ giving the following returns in two different market scenarios.

Scenario      Probability of scenario    Return $O_1$    Return $O_2$
$\omega_1$    0.25                       8%              12%
$\omega_2$    0.75                       11%             9%

Which of the two is a more risky opportunity?

Solution Here
\[ \mu_1 = E(O_1) = (0.25)(0.08) + (0.75)(0.11) = 0.1025, \]
\[ \mu_2 = E(O_2) = (0.25)(0.12) + (0.75)(0.09) = 0.0975, \]
\[ \sigma_1^2 = \mathrm{var}(O_1) = (0.25)(0.08 - 0.1025)^2 + (0.75)(0.11 - 0.1025)^2 = 0.00016875, \]
\[ \sigma_2^2 = \mathrm{var}(O_2) = (0.25)(0.12 - 0.0975)^2 + (0.75)(0.09 - 0.0975)^2 = 0.00016875. \]

Thus both opportunities are equally risky although the expected return on the first opportunity (10.25%) is higher than that on the second opportunity (9.75%).
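The arithmetic of Example 17.2.1 can be checked with a few lines of Python. This is only a sketch; the helper name `mean_and_variance` is our own choice, and the probabilities and returns are those of the example.

```python
def mean_and_variance(probabilities, returns):
    """Expected value and variance of a discrete random return."""
    mean = sum(p * r for p, r in zip(probabilities, returns))
    variance = sum(p * (r - mean) ** 2 for p, r in zip(probabilities, returns))
    return mean, variance


scenario_probabilities = [0.25, 0.75]
mu_1, var_1 = mean_and_variance(scenario_probabilities, [0.08, 0.11])  # opportunity O_1
mu_2, var_2 = mean_and_variance(scenario_probabilities, [0.12, 0.09])  # opportunity O_2
```

Both variances agree, which is the observation made in the example: the two opportunities carry the same risk although their expected returns differ.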


17.3 Two Asset Portfolio Optimization

Consider a portfolio with two assets, say, $a_1$, $a_2$ with weights $w_1$, $w_2$, returns $r_1$, $r_2$ and standard deviations $\sigma_1$, $\sigma_2$, respectively. Then the portfolio expected return $\mu$ and portfolio variance $\sigma^2$ are respectively given by
\[ \mu = E(w_1 r_1 + w_2 r_2) = w_1\mu_1 + w_2\mu_2, \tag{17.1} \]
\[ \sigma^2 = \mathrm{var}(w_1 r_1 + w_2 r_2) = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2\rho w_1 w_2 \sigma_1\sigma_2, \tag{17.2} \]
where $\rho$ is the correlation coefficient between $r_1$ and $r_2$; the value of $\rho$ lies in $[-1, 1]$.

Note that the value of $\rho$ provides a measure of the extent to which diversification can reduce risk. The more negative the value of $\rho$, the greater the benefits of the portfolio diversification.

As $w_1$ and $w_2$ are weights representing the proportions of total investment in assets $a_1$ and $a_2$ respectively, we have $w_1 + w_2 = 1$. Moreover, in case of short selling, the weights can be negative. Subsequently, we write $w_1 = 1 - s$, and so, $w_2 = s$.
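Now, it follows from relations (17.1) and (17.2) that
\[ \mu = (1 - s)\mu_1 + s\mu_2, \tag{17.3} \]
\[ \sigma^2 = (1 - s)^2\sigma_1^2 + s^2\sigma_2^2 + 2\rho(1 - s)s\,\sigma_1\sigma_2. \tag{17.4} \]
Relation (17.4) can be simplified as
\[ \sigma^2 = (\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)s^2 - 2\sigma_1(\sigma_1 - \rho\sigma_2)s + \sigma_1^2. \tag{17.5} \]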
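As a quick numerical sanity check of the simplification from (17.4) to (17.5), the sketch below evaluates both expressions on a small grid of $s$ and $\rho$. The function names and the values of $\sigma_1$, $\sigma_2$ (0.16 and 0.20) are assumptions chosen only for illustration.

```python
def two_asset_mean(s, mu1, mu2):
    """Portfolio expected return, relation (17.3)."""
    return (1.0 - s) * mu1 + s * mu2


def variance_expanded(s, sigma1, sigma2, rho):
    """Portfolio variance in the form of relation (17.4)."""
    return ((1.0 - s) ** 2 * sigma1 ** 2
            + s ** 2 * sigma2 ** 2
            + 2.0 * rho * (1.0 - s) * s * sigma1 * sigma2)


def variance_quadratic(s, sigma1, sigma2, rho):
    """Portfolio variance as a quadratic in s, relation (17.5)."""
    a = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    b = -2.0 * sigma1 * (sigma1 - rho * sigma2)
    c = sigma1 ** 2
    return a * s ** 2 + b * s + c


# Largest disagreement between the two forms over an assumed test grid.
max_gap = max(
    abs(variance_expanded(s, 0.16, 0.20, rho) - variance_quadratic(s, 0.16, 0.20, rho))
    for s in (-0.5, 0.0, 0.3, 1.0, 1.5)
    for rho in (-1.0, -0.5, 0.0, 0.5, 1.0)
)
```

The grid deliberately includes values of $s$ outside $[0, 1]$, which correspond to short positions.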


Without loss of generality we assume that $0 < \sigma_1 \le \sigma_2$. We discuss the following two independent cases.

(i) $\rho = \pm 1$, (ii) $-1 < \rho < 1$.

Case (i). $\rho = \pm 1$. From relations (17.3) and (17.5), the portfolio expected return and standard deviation are respectively given by
\[ \mu = (1 - s)\mu_1 + s\mu_2, \]
\[ \sigma = |(1 - s)\sigma_1 \pm s\sigma_2|. \]







Fig. 17.1.

[...] both weights are non-negative and thus the portfolio has no short position. [...]

Fig. 17.2.


From Fig 17.2 we have the following observations to share.

Remark 17.3.1 (i) When the returns of the two assets are perfectly positively correlated, i.e. $\rho = 1$, the higher expected return of the portfolio comes along with the higher risk. Furthermore, the risk of the portfolio can be completely eliminated by taking a short position on asset $a_2$ (point $P$ in the first figure of Fig 17.2).

(ii) When the returns of the two assets are perfectly negatively correlated, i.e. $\rho = -1$, the expected portfolio return increases with the decrease in the risk of the portfolio (point $A$ to point $P$ in the second figure of Fig 17.2) until the risk is completely eliminated (point $P$). After that the higher return comes with higher risk as the weight of the riskier asset increases.

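Below we provide mathematical justification of the above observations.

When $\rho = 1$ and $\sigma_1 = \sigma_2$, then $\sigma = \sigma_1$ for every $s$, so that $\sigma_{\min} = \sigma_1$.
When $\rho = 1$ and $\sigma_1 < \sigma_2$, then we have $\sigma = |(1 - s)\sigma_1 + s\sigma_2|$, which vanishes at $s = -\sigma_1/(\sigma_2 - \sigma_1) < 0$. Thus $\sigma_{\min} = 0$, and it is attained by taking a short position on asset $a_2$.
When $\rho = -1$, then $\sigma = |(1 - s)\sigma_1 - s\sigma_2|$, which vanishes at $s = \sigma_1/(\sigma_1 + \sigma_2) \in (0, 1)$. Thus $\sigma_{\min} = 0$ is attained without short selling.

Case (ii). $-1 < \rho < 1$. Here $\sigma^2$, given by (17.5), is a quadratic function of $s$ and represents a parabola, as depicted in Fig 17.3. We wish to minimize $\sigma^2$. Now,
\[ \frac{d\sigma^2}{ds} = 0 \implies s_{\min} = \frac{\sigma_1^2 - \rho\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}. \]
Also,
\[ \frac{d^2\sigma^2}{ds^2} = 2\left((\sigma_1 - \rho\sigma_2)^2 + (1 - \rho^2)\sigma_2^2\right) > 0. \]
Consequently, the minimum value of $\sigma^2$ is attained at $s = s_{\min}$ and is given by
\[ \sigma_{\min}^2 = \frac{\sigma_1^2\sigma_2^2(1 - \rho^2)}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}. \]

Moreover, the expected portfolio return at the minimum risk point equals
\[ \mu_{\min} = (\mu_2 - \mu_1)s_{\min} + \mu_1. \]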
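The closed-form expressions for the minimum-risk point of a two-asset portfolio are easy to evaluate. The sketch below is one possible implementation under the assumption $-1 < \rho < 1$; the function names and the sample inputs (expected returns 0.12 and 0.16, standard deviations 0.16 and 0.20) are illustrative assumptions.

```python
def two_asset_variance(s, sigma1, sigma2, rho):
    """Variance of the portfolio with weights (1 - s, s), relation (17.4)."""
    return ((1.0 - s) ** 2 * sigma1 ** 2 + s ** 2 * sigma2 ** 2
            + 2.0 * rho * (1.0 - s) * s * sigma1 * sigma2)


def min_variance_two_assets(mu1, mu2, sigma1, sigma2, rho):
    """Return (s_min, sigma_min^2, mu_min) for a two-asset portfolio."""
    denom = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    if denom == 0.0:
        # Happens only when rho = 1 and sigma1 = sigma2: every s gives the same risk.
        raise ValueError("variance does not depend on s; no unique minimizer")
    s_min = (sigma1 ** 2 - rho * sigma1 * sigma2) / denom
    var_min = sigma1 ** 2 * sigma2 ** 2 * (1.0 - rho ** 2) / denom
    mu_min = (mu2 - mu1) * s_min + mu1
    return s_min, var_min, mu_min


# Assumed sample data: a negatively correlated pair of assets.
s_min, var_min, mu_min = min_variance_two_assets(0.12, 0.16, 0.16, 0.20, -0.5)

# With rho above sigma1/sigma2 = 0.8 the minimizer requires a short position.
s_short, _, _ = min_variance_two_assets(0.12, 0.16, 0.16, 0.20, 0.9)
```

Comparing `var_min` with `two_asset_variance(s_min, ...)` is a convenient way to confirm that the closed form and the direct evaluation agree.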


Remark 17.3.2 It is important to take note of the following points.

(i) The condition $-1 < \rho < \sigma_1/\sigma_2$ is equivalent to $0 < s_{\min} < 1$. Thus the minimum portfolio risk can be achieved without short selling. Also, $\sigma_{\min}^2 < \sigma_1^2$.

(ii) $\rho = \sigma_1/\sigma_2 \iff s_{\min} = 0 \iff \sigma_{\min}^2 = \sigma_1^2$.

(iii) The condition $\sigma_1/\sigma_2 < \rho < 1$ is equivalent to $s_{\min} < 0$. In this case the investor has taken a short position on asset $a_2$ in order to minimize the portfolio risk. Further, $\sigma_{\min} = 0$ only when $\rho = \pm 1$.


The entire theory of this section is summarized in Fig 17.4.

The risk-return relation of two assets for various values of $\rho$ provides us with a triangle $\triangle APB$. The points $A$ and $B$ signify undiversified portfolios. The triangle $\triangle APB$ specifies the limit of diversification. The risk-return relations for all values of $\rho$ except $\pm 1$ lie within this triangle.

Let us verify the two-asset portfolio theory by considering the following example and plotting the corresponding $(\sigma, \mu)$-graph.

Let $A$ and $B$ be two assets with expected returns 12% and 16% and standard deviations 16% and 20%, respectively. Let $x_A$ and $x_B$ denote the number of units of asset $A$ and asset $B$, respectively, in a portfolio.

[...] The underlying principle is thus 'do not put all your eggs in one basket': diversification is advisable for reducing the risk. The risk reduces as $\rho$ moves from 1 to $-1$ [...]

Fig. 17.5.

In Fig 17.5, as the weights of the assets $A$ and $B$ vary from 100 to 0 and 0 to 100, respectively, the curve between risk and return moves from extreme right (point $P$) to extreme left and turns to move from extreme left to extreme right (north-east point $Q$) corresponding to the various values of $\rho$ between $-1$ and 1.

17.4 Multi Asset Portfolio Optimization
In this section we extend the two-asset portfolio theory to the $n$-asset portfolio theory. The weights of the various assets $a_1, \dots, a_n$ are written in the vector form $w = (w_1, \dots, w_n)^T$. Let $e = (1, \dots, 1)^T \in \mathbb{R}^n$. Then the condition $w_1 + \dots + w_n = 1$ can be expressed as $e^Tw = 1$. Let $m = (\mu_1, \dots, \mu_n)^T$ be the expected return vector, where $\mu_i = E(r_i)$ $(i = 1, \dots, n)$, and let $C = [c_{ij}]$ denote the $n \times n$ covariance matrix with $c_{ij} = \mathrm{cov}(r_i, r_j)$. Note that $c_{ii} = \sigma_i^2$. Also, $C$ is a symmetric positive definite matrix and is therefore invertible.

The expected return $\mu$ of the portfolio is
\[ \mu = E\left(\sum_{i=1}^{n} w_i r_i\right) = \sum_{i=1}^{n} \mu_i w_i = m^Tw, \]
and the risk $\sigma^2$ of the portfolio is
\[ \sigma^2 = \mathrm{var}\left(\sum_{i=1}^{n} w_i r_i\right) = \sum_{i,j=1}^{n} c_{ij} w_i w_j = w^TCw. \tag{17.8} \]
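The two quantities $m^Tw$ and $w^TCw$ are all that is needed to place a portfolio in the $(\sigma, \mu)$-graph. A minimal sketch in plain Python follows; the helper names and the sample data (a $3 \times 3$ covariance matrix, a return vector and equal weights) are assumptions made purely for illustration.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, vector) for row in matrix]


def portfolio_return(m, w):
    """Expected portfolio return mu = m^T w."""
    return dot(m, w)


def portfolio_variance(C, w):
    """Portfolio risk sigma^2 = w^T C w, relation (17.8)."""
    return dot(w, mat_vec(C, w))


# Assumed sample data for three risky assets.
C = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
m = [0.4, 0.8, 0.8]
w = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]  # equal weights, e^T w = 1

mu = portfolio_return(m, w)
sigma_sq = portfolio_variance(C, w)
```

Changing only `w` while keeping `C` and `m` fixed mirrors the situation of the investor, who controls the weights but not the risks or correlations of the assets.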


i=] i,J=1 


During a portfolio selection, every investor is faced with a choice of either minimizing the risk for a certain value of return or maximizing the return for a certain value of risk.

Now, from (17.8), we observe that the portfolio risk $\sigma$ depends on three factors, viz.
(i) risk of each individual asset;
(ii) coefficient of correlation between asset returns;
(iii) weights of the assets.
Out of these contributing factors, the only factor that an investor can control is the weights of the assets. Our main aim is to examine the optimal choice of these weights.

Consider the hyperplane $e^Tw = 1$ in $\mathbb{R}^n$ in which the weight vector $w$ resides. Let $f$ be the mapping that takes each weight vector in the weight hyperplane to the corresponding portfolio point in the $(\sigma, \mu)$-graph. We try and find the image of any straight line in the weight hyperplane $e^Tw = 1$ under the mapping $f$.
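The parametric equation of any line in the weight hyperplane is of the form
\[ l(\xi) = (\xi s_1 + b_1, \dots, \xi s_n + b_n) = \xi s + b, \quad \xi \in \mathbb{R}, \]
where $s = (s_1, \dots, s_n)$ and $b = (b_1, \dots, b_n)$. Let $w$ be any point on this line. Then
\[ \mu = m^Tw = m^T(\xi s + b) = \xi(m^Ts) + (m^Tb). \]
Assuming $m^Ts \neq 0$, let $\alpha = (m^Ts)^{-1}$ and $\beta = -(m^Tb)(m^Ts)^{-1}$. Then, $\xi = \alpha\mu + \beta$. Moreover,
\[ \sigma^2 = w^TCw = (\xi s + b)^TC(\xi s + b) = (s^TCs)\xi^2 + (s^TCb + b^TCs)\xi + b^TCb = \gamma\xi^2 + \delta\xi + \eta, \]
where $\gamma = s^TCs$, $\delta = s^TCb + b^TCs$ and $\eta = b^TCb$.

Substituting the value of $\xi$ we get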


\[ \sigma^2 = \gamma(\alpha\mu + \beta)^2 + \delta(\alpha\mu + \beta) + \eta. \tag{17.9} \]
As $\xi$ varies from $-\infty$ to $\infty$, the ordered pair $(\sigma^2, \mu)$ traces out a parabola given by (17.9), which lies in the $(\sigma^2, \mu)$-graph with its axis parallel to the $\sigma^2$-axis and its sides open on the right. We are actually interested in the $(\sigma, \mu)$-graph. Taking the square root of $\sigma^2$, the resulting curve is
\[ \sigma = \sqrt{\gamma(\alpha\mu + \beta)^2 + \delta(\alpha\mu + \beta) + \eta}. \tag{17.10} \]
This curve is called a Markowitz curve. Thus, each line in the weight hyperplane is mapped onto a Markowitz curve. This phenomenon is depicted in Fig 17.6.
[Figure: weight lines in the weight hyperplane $w_1 + w_2 + w_3 = 1$ and the corresponding Markowitz curves in the $(\sigma, \mu)$-graph]

Fig. 17.6.


Remark 17.4.1 Here it is important to note that the Markowitz curve (17.10) is not a parabola. In fact the main difference between the parabola (17.9) and the Markowitz curve (17.10) in the $(\sigma, \mu)$-graph is that a tangent can be drawn to the parabola (17.9) from any point on the $\mu$-axis, whereas the Markowitz curve behaves almost as a straight line as $\mu \to \infty$, and therefore it is not possible to draw a tangent to the Markowitz curve as $\mu \to \infty$. This difference may not sound significant right now but it plays a vital role when the portfolio consists of one risk-free asset. We shall be addressing this type of portfolio in the next section. For the current discussion we have assumed that all the assets in the portfolio are risky.


As already observed, among all the factors affecting the risk $\sigma^2$ of the portfolio, the only factor that can be controlled by an investor is the weight vector $w$. In other words, an investor can decide upon an appropriate weight vector to minimize the overall risk in the investment.


Theorem 17.4.1 A portfolio with minimum risk has weights given by
\[ w = \frac{C^{-1}e}{e^TC^{-1}e}. \]

Proof. The problem is
\[ \text{Min } \sigma^2 = w^TCw \]
subject to
\[ e^Tw = 1. \tag{17.11} \]
Using the Lagrange multiplier $\lambda \in \mathbb{R}$, we minimize the Lagrangian
\[ L(w, \lambda) = w^TCw + \lambda(1 - e^Tw). \tag{17.12} \]
Note that $\lambda$ is unrestricted in sign because the constraint in the risk minimization problem is an equation $e^Tw = 1$. Now, differentiating (17.12) with respect to $w$, we obtain
\[ 2w^TC - \lambda e^T = 0 \implies w = \frac{\lambda}{2}C^{-1}e. \]
Using (17.11), we get
\[ \frac{\lambda}{2}e^TC^{-1}e = 1 \implies \frac{\lambda}{2} = \frac{1}{e^TC^{-1}e}. \]
Thus the requisite result follows. □
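Theorem 17.4.1 translates into one linear solve followed by a normalization. The sketch below assumes a small dense covariance matrix and uses a hand-written Gaussian elimination, so that no external library is required; the names `solve` and `min_risk_weights` and the $2 \times 2$ sample matrix are our own illustrative choices.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(x) for x in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def min_risk_weights(C):
    """Weights w = C^{-1} e / (e^T C^{-1} e) and the corresponding variance."""
    n = len(C)
    v = solve(C, [1.0] * n)          # v = C^{-1} e
    total = sum(v)                    # e^T C^{-1} e
    w = [vi / total for vi in v]
    variance = sum(w[i] * C[i][j] * w[j] for i in range(n) for j in range(n))
    return w, variance


# Assumed sample data: two assets with variances 2, 2 and covariance 1.
w_min, var_min = min_risk_weights([[2.0, 1.0], [1.0, 2.0]])
```

For a positive definite $C$ the quantity $e^TC^{-1}e$ is positive, so the normalization is well defined, and the minimum variance equals $1/(e^TC^{-1}e)$.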


Definition 17.4.1 (Markowitz Efficient Frontier). The set of points which provides minimum risk for each expected return value $\mu$ is a Markowitz curve called the Markowitz efficient frontier.

Definition 17.4.2 (Minimum Risk Weight Line). Corresponding to the Markowitz efficient frontier in the $(\sigma, \mu)$-graph, the line in the weight hyperplane $e^Tw = 1$ is called the minimum risk weight line.


[Figure: the weight hyperplane $w_1 + w_2 + w_3 = 1$ with the minimum risk weight line, and the $(\sigma, \mu)$-graph showing the efficient frontier, the Markowitz bullet and the point $(\sigma_{\min}, \mu_{\min})$]

Fig. 17.7.

The Markowitz efficient frontier and the minimum risk weight line are depicted in Fig 17.7.

Example 17.4.1 Suppose there are two assets $a_1$ and $a_2$ with $\mu_1 = 0.4$, $\mu_2 = 0.8$, $\sigma_1^2 = 2$, $\sigma_2^2 = 2$, $c_{12} = 1$. Obtain the minimum variance point and sketch the entire efficient frontier.

Solution Here, we can note that
\[ \rho = \frac{c_{12}}{\sigma_1\sigma_2} = \frac{1}{2}, \qquad \sigma_{\min}^2 = \frac{\sigma_1^2\sigma_2^2(1 - \rho^2)}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2} = \frac{3}{2}, \]
\[ s_{\min} = \frac{\sigma_1^2 - \rho\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2} = \frac{1}{2}, \qquad \mu_{\min} = (\mu_2 - \mu_1)s_{\min} + \mu_1 = \frac{3}{5}. \]

Thus $(\sigma_{\min}, \mu_{\min}) = (1.2247, 0.6)$. The corresponding Markowitz curve and efficient frontier are shown in bold in Fig 17.8.


[Figure: Markowitz curve in the $(\sigma, \mu)$-graph passing through the points (1.414, 0.4), (1.2247, 0.6) and (1.414, 0.8)]

Fig. 17.8.


In many cases, it is more likely that an investor provides a fixed value of the expected return, say $\mu$, that he desires to achieve. He has to decide the right investment strategy to obtain the return $\mu$ with the minimum risk. We look at this scenario in the result to follow.

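Theorem 17.4.2 For a given expected return $\mu$, the portfolio with minimum risk has weights given by
\[ w = \frac{\det\begin{pmatrix} \mu & m^TC^{-1}e \\ 1 & e^TC^{-1}e \end{pmatrix} C^{-1}m + \det\begin{pmatrix} m^TC^{-1}m & \mu \\ e^TC^{-1}m & 1 \end{pmatrix} C^{-1}e}{\det\begin{pmatrix} m^TC^{-1}m & m^TC^{-1}e \\ e^TC^{-1}m & e^TC^{-1}e \end{pmatrix}}. \tag{17.13} \]

Proof. We wish to solve the following quadratic programming problem
\[ \text{Min } \sigma^2 = w^TCw \]
subject to
\[ m^Tw = \mu, \qquad e^Tw = 1. \tag{17.14} \]
This is a quadratic programming problem with unrestricted variable vector $w$. Using the Lagrange multipliers $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}$, we minimize the Lagrangian
\[ L(w, \alpha, \beta) = w^TCw + \alpha(\mu - m^Tw) + \beta(1 - e^Tw) \]
to obtain
\[ 2Cw - \alpha m - \beta e = 0 \implies w = \frac{1}{2}C^{-1}(\alpha m + \beta e). \tag{17.15} \]
Substituting the value of $w$ in (17.14), we get
\[ (m^TC^{-1}m)\alpha + (m^TC^{-1}e)\beta = 2\mu, \]
\[ (e^TC^{-1}m)\alpha + (e^TC^{-1}e)\beta = 2. \]
Solving the above two equations for $\alpha$ and $\beta$, we obtain
\[ \alpha = \frac{2\det\begin{pmatrix} \mu & m^TC^{-1}e \\ 1 & e^TC^{-1}e \end{pmatrix}}{\det\begin{pmatrix} m^TC^{-1}m & m^TC^{-1}e \\ e^TC^{-1}m & e^TC^{-1}e \end{pmatrix}}, \qquad \beta = \frac{2\det\begin{pmatrix} m^TC^{-1}m & \mu \\ e^TC^{-1}m & 1 \end{pmatrix}}{\det\begin{pmatrix} m^TC^{-1}m & m^TC^{-1}e \\ e^TC^{-1}m & e^TC^{-1}e \end{pmatrix}}. \]
Substituting these values in the expression (17.15) for $w$, we get the required expression (17.13). □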
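Expression (17.13) needs only the two vectors $C^{-1}m$ and $C^{-1}e$ and three scalar products. The following sketch evaluates it in plain Python; the function names, the elimination routine and the sample data (a $3 \times 3$ covariance matrix, a return vector and the target return 1/3) are assumptions made for illustration only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(x) for x in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def min_risk_weights_for_return(C, m, mu):
    """Evaluate expression (17.13) for the target expected return mu."""
    e = [1.0] * len(m)
    cinv_m = solve(C, m)               # C^{-1} m
    cinv_e = solve(C, e)               # C^{-1} e
    a = dot(m, cinv_m)                 # m^T C^{-1} m
    b = dot(m, cinv_e)                 # m^T C^{-1} e = e^T C^{-1} m (C symmetric)
    c = dot(e, cinv_e)                 # e^T C^{-1} e
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("m is proportional to e; the target return cannot be varied")
    coef_m = mu * c - b                # det [[mu, b], [1, c]]
    coef_e = a - mu * b                # det [[a, mu], [b, 1]]
    return [(coef_m * x + coef_e * y) / det for x, y in zip(cinv_m, cinv_e)]


# Assumed sample data.
C = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
m = [0.4, 0.8, 0.8]
w_target = min_risk_weights_for_return(C, m, 1.0 / 3.0)
```

A useful check on any implementation is that the returned weights satisfy both constraints of (17.14), i.e. they sum to 1 and reproduce the target return.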


Definition 17.4.3 (Markowitz Bullet). Since the efficient frontier contains all the points of minimum risk for the preassigned value of return, all the feasible points of problem (17.14) lie on or to the right of this frontier. Due to its shape, this region is called the Markowitz Bullet.


It is worthwhile to pause here to make an important observation about the minimum-variance set for a fixed return. Recall from (17.14) and (17.15) that, for a given value of return $\mu$, the points of minimum variance must satisfy the following system of $(n + 2)$ linear equations in $(n + 2)$ unknowns $w \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, $\beta \in \mathbb{R}$.
\[ 2Cw - \alpha m - \beta e = 0, \qquad m^Tw = \mu, \qquad e^Tw = 1. \tag{17.16} \]

Suppose we solve the system (17.16) for two distinct values of expected return $\mu$, say $\mu^1$ and $\mu^2$. Let the two solutions be $(w^1 = (w^1_1, \dots, w^1_n), \alpha^1, \beta^1)^T$ and $(w^2 = (w^2_1, \dots, w^2_n), \alpha^2, \beta^2)^T$, respectively. Then it is simple to verify that the combination $\lambda(w^1, \alpha^1, \beta^1) + (1 - \lambda)(w^2, \alpha^2, \beta^2)$, $\lambda \in \mathbb{R}$, is also a solution of the system (17.16) for the expected return $\lambda\mu^1 + (1 - \lambda)\mu^2$. Therefore, in order to solve (17.16) for every value of $\mu$, we only need to solve it for two distinct values of $\mu$ and then form linear combinations of the two solutions. Thus, the knowledge of two distinct portfolios yielding the minimum variances is sufficient to generate the entire minimum variance set. This result is significant from the investor's point of view. Also, it demonstrates a good application of KKT optimality conditions. The result is known as the two fund theorem.

Theorem 17.4.3 (Two Fund Theorem). Two efficient portfolios can be established so that any other efficient portfolio can be duplicated, in terms of mean and variance, as a linear combination of the two. In other words, an investor seeking an efficient portfolio need to invest only in the combination of these two assets.

A convenient way to get two solutions of equations (17.16) is to assign two distinct values to $\alpha$ and $\beta$, and then work out the solutions. The most convenient choices are $\alpha = 0$, $\beta = 1$ and $\alpha = 1$, $\beta = 0$.


Example 17.4.2 Consider three risky assets with the covariance matrix and expected returns as follows.
\[ C = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \qquad m = \begin{pmatrix} 0.4 \\ 0.8 \\ 0.8 \end{pmatrix}. \]
Find two portfolios determining the minimum variance. Also determine the expected returns from these two portfolios. Using the two fund theorem, construct the portfolio giving the return of 33.4% with minimum risk.

Solution Taking $\alpha = 0$, $\beta = 1$ in (17.16), we need to solve
\[ \sum_{j=1}^{3} \sigma_{ij}v^1_j = 1 \quad (i = 1, 2, 3), \]
where $\sigma_{ij}$ denotes the $(i, j)$-th entry of $C$. The solution is $v^1 = (0.5, 0, 0.5)^T$. We next take $\alpha = 1$, $\beta = 0$ in (17.16), to solve
\[ \sum_{j=1}^{3} \sigma_{ij}v^2_j = \mu_i \quad (i = 1, 2, 3). \]
The solution of the above system is $v^2 = (0.1, 0.2, 0.3)^T$. Constant factors multiplying $v^1$ and $v^2$ are immaterial here because both vectors are normalized next.

Note that $\sum_{i=1}^{3} v^1_i = 1$, so we take $w^1 = v^1 = (1/2, 0, 1/2)^T$. Normalizing $v^2$, we get $w^2 = (1/6, 1/3, 1/2)^T$ (so that $\sum_{i=1}^{3} w^2_i = 1$). The corresponding returns for the two portfolios with weights $w^1$ and $w^2$ are $\mu^1 = m^Tw^1 = 0.6$ and $\mu^2 = m^Tw^2 = 0.733$, respectively.

Next, an investor desires a return of $\mu = 0.334$ at minimum risk. It is simple to check that for $\lambda = 3$, $\lambda\mu^1 + (1 - \lambda)\mu^2 = 3(0.6) - 2(0.733) = 0.334$. Thus, by the two fund theorem the requisite portfolio is given by $w = \lambda w^1 + (1 - \lambda)w^2 = (7/6, -2/3, 1/2)^T$. Observe that the second asset has a short position in this portfolio. The variance corresponding to this portfolio is
\[ w^TCw = \begin{pmatrix} 7/6 & -2/3 & 1/2 \end{pmatrix} \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix} \begin{pmatrix} 7/6 \\ -2/3 \\ 1/2 \end{pmatrix} = \frac{17}{9}. \]
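The construction used in Example 17.4.2 (solve two linear systems, normalize, then combine) can be sketched as follows. The function `two_fund_portfolio` and the elimination routine are our own illustrative helpers, the matrix $C = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]$ is the covariance matrix assumed for this example, and the vector $m$ and the target return are those of the example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(x) for x in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    total = sum(v)
    if abs(total) < 1e-12:
        raise ValueError("vector sums to zero and cannot be normalized")
    return [x / total for x in v]


def two_fund_portfolio(C, m, target):
    """Combine two minimum-variance funds to reach the target expected return."""
    w1 = normalize(solve(C, [1.0] * len(m)))   # fund from C v = e
    w2 = normalize(solve(C, m))                # fund from C v = m
    mu1, mu2 = dot(m, w1), dot(m, w2)
    if abs(mu1 - mu2) < 1e-12:
        raise ValueError("the two funds have the same expected return")
    lam = (target - mu2) / (mu1 - mu2)         # lam*mu1 + (1 - lam)*mu2 = target
    w = [lam * a + (1.0 - lam) * b for a, b in zip(w1, w2)]
    variance = dot(w, [dot(row, w) for row in C])
    return lam, w, variance


C = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
m = [0.4, 0.8, 0.8]
lam, w, variance = two_fund_portfolio(C, m, 0.334)
```

With the rounded target 0.334 the multiplier comes out close to, but not exactly, 3; using the target 1/3 gives $\lambda = 3$ up to rounding error.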
5 | | op 2 172 


17.5 Capital Asset Pricing Model (CAPM) 


So far we have assumed that all assets in the portfolio are risky assets. So it is natural to 
query as to what would be the scenario if one risk-free asset is included in the portfolio? 
In this section we make an attempt to study this aspect of portfolio selection. 

Consider a portfolio with $n$ risky assets $a_1, \dots, a_n$ with weights $w_1, \dots, w_n$ and one risk-free asset $a_{rf}$ with weight $w_{rf}$. Then
\[ w_{risky} + w_{rf} = \sum_{i=1}^{n} w_i + w_{rf} = 1, \tag{17.17} \]
where
\[ w_{risky} = \sum_{i=1}^{n} w_i. \]

Also, the expected return and the variance associated with this portfolio are respectively given by
\[ \mu = \sum_{i=1}^{n} w_i\mu_i + w_{rf}\mu_{rf} = \mu_{risky} + w_{rf}\mu_{rf}, \]
\[ \sigma^2 = \mathrm{var}\left(\sum_{i=1}^{n} w_i r_i + w_{rf}r_{rf}\right) = \mathrm{var}\left(\sum_{i=1}^{n} w_i r_i\right) = \sigma_{risky}^2, \]
since the return $r_{rf}$ of the risk-free asset is not random.

If we remove the risk-free asset from the portfolio and readjust the weights of the risky assets so that their sum remains 1, the resultant portfolio so obtained is called the derived risky portfolio. We use $\mu_{der}$ and $\sigma_{der}^2$ to denote the expected return and the variance of the derived risky portfolio. Then, for $w_{risky} > 0$,
\[ \mu_{der} = \sum_{i=1}^{n} \frac{w_i}{w_{risky}}\mu_i = \frac{\mu_{risky}}{w_{risky}}, \quad \text{so that} \quad \mu = w_{risky}\mu_{der} + w_{rf}\mu_{rf}, \tag{17.18} \]
\[ \sigma^2 = \mathrm{var}\left(\sum_{i=1}^{n} w_i r_i\right) = w_{risky}^2\,\mathrm{var}\left(\sum_{i=1}^{n} \frac{w_i}{w_{risky}} r_i\right) = w_{risky}^2\sigma_{der}^2. \tag{17.19} \]
Using (17.17) and (17.19) in (17.18), we obtain
\[ \mu = \mu_{rf} + \frac{\mu_{der} - \mu_{rf}}{\sigma_{der}}\,\sigma. \tag{17.20} \]
(17.20) is an equation of a line joining $(0, \mu_{rf})$ and $(\sigma_{der}, \mu_{der})$ in the $(\sigma, \mu)$-graph.

Now, for a given risk $\sigma$, suppose we choose various weight combinations of risk-free asset and risky assets satisfying (17.17). We then generate different lines represented by (17.20) in the $(\sigma, \mu)$-graph. Obviously, among all such lines, the line that gives points with the highest expected return for a given risk is tangent to the upper portion of the Markowitz bullet. This is illustrated in Fig 17.9.


[Fig. 17.9: the capital market line in the $(\sigma, \mu)$-graph, touching the Markowitz bullet at $(\sigma_{der}, \mu_{der})$]

Definition 17.5.1 (Capital Market Line). Among all the lines (17.20) for various weight combinations of risk-free asset and risky assets, the line giving the highest return for a given risk is called the capital market line.

The basic idea of the capital asset pricing model (CAPM) is that an investor can improve the risk-expected return balance by investing partially in a portfolio of risky assets and partially in a risk-free asset. All investors will end up with portfolios along the capital market line, as all efficient portfolios lie along this line, while any other combination of risk-free asset and risky assets, except those which are efficient, lies below the capital market line. It is thus important to observe that all investors will hold combinations of only two assets, viz. a market portfolio and a risk-free asset. This scenario is summarized in the following theorem.


Theorem 17.5.1 (One Fund Theorem). There exists a single portfolio, $M$, of risky assets such that any efficient portfolio can be constructed as a linear combination of the tangent portfolio $M$ and the risk-free asset.

Unlike the two fund theorem, where any two efficient portfolios are sufficient, in this case the tangent portfolio is a specific portfolio.

Definition 17.5.2 (Market Portfolio). The point on the Markowitz bullet where the capital market line is tangential is said to represent the market portfolio.

Importance of Market Portfolio
(i) The market portfolio must contain all risky assets, for if some asset is not in it then it will wither and die.
(ii) Since the market portfolio contains all risky assets, it is a completely diversified portfolio with no unsystematic risk.


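Theorem 17.5.2 For any expected risk-free return $\mu_{rf}$, the weights of the capital market portfolio are given by
\[ w_{mt} = \frac{C^{-1}(m - \mu_{rf}e)}{e^TC^{-1}(m - \mu_{rf}e)}. \]

Proof. From Fig 17.9, we observe that for any point $(\sigma, \mu)$ in the Markowitz bullet, the slope of the line joining $(0, \mu_{rf})$ to $(\sigma, \mu)$ is
\[ \frac{\mu - \mu_{rf}}{\sigma} = \frac{\sum_{i=1}^{n} \mu_i w_i - \mu_{rf}}{\left(\sum_{i,j=1}^{n} c_{ij}w_iw_j\right)^{1/2}}. \]
For the tangent line to the Markowitz bullet, this slope has to be maximum. We therefore solve
\[ \text{Max } \frac{m^Tw - \mu_{rf}}{(w^TCw)^{1/2}} \]
subject to
\[ e^Tw = 1. \tag{17.21} \]
The Lagrange function $L : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is described as
\[ L(w, \lambda) = \frac{m^Tw - \mu_{rf}}{(w^TCw)^{1/2}} + \lambda(1 - e^Tw). \]
Now, solving (17.21) amounts to finding a stationary point of $L(w, \lambda)$. So, $\nabla_w L(w, \lambda) = 0$, giving
\[ (w^TCw)^{-1/2}m - (m^Tw - \mu_{rf})(w^TCw)^{-3/2}Cw = \lambda e. \]
The above expression can be rewritten as
\[ \frac{m}{\sigma} - (\mu - \mu_{rf})\frac{Cw}{\sigma^3} = \lambda e. \]
Multiplying by $\sigma^3$, we obtain
\[ \sigma^2 m - (\mu - \mu_{rf})Cw = \lambda\sigma^3 e, \tag{17.22} \]
which in turn yields
\[ \sigma^2 w^Tm - (\mu - \mu_{rf})w^TCw = \lambda\sigma^3 w^Te. \]
Since $w^Te = 1$, $\mu = w^Tm$, and $\sigma^2 = w^TCw$, we get
\[ \lambda = \frac{\mu_{rf}}{\sigma}. \tag{17.23} \]
The requisite value of the weight vector $w$ now follows from (17.22) and (17.23): substituting (17.23) in (17.22) gives $(\mu - \mu_{rf})Cw = \sigma^2(m - \mu_{rf}e)$, so that $w$ is proportional to $C^{-1}(m - \mu_{rf}e)$, and the constant of proportionality is fixed by $e^Tw = 1$. □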
Example 17.5.1 Suppose a portfolio comprises one risk-free asset with return 0.5%, and three mutually independent risky assets with expected returns 1%, 2%, 3% and variances 1%, 1%, 1%, respectively. Determine the equation of the capital market line.

Solution The given information gives $m^T = (\mu_1, \mu_2, \mu_3) = (1, 2, 3)$, $\mu_{rf} = 0.5$, and $C = [c_{ij}] = I_{3 \times 3}$ (as $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1$ and the assets are mutually independent). Therefore, the weight vector of the market portfolio is given by
\[ w_{mt} = \frac{C^{-1}(m - \mu_{rf}e)}{e^TC^{-1}(m - \mu_{rf}e)} = \left(\frac{1}{9}, \frac{3}{9}, \frac{5}{9}\right)^T. \]
The expected return and standard deviation of the market portfolio are
\[ \mu_{mt} = m^Tw_{mt} = \frac{22}{9}, \qquad \sigma_{mt} = \left(w_{mt}^TCw_{mt}\right)^{1/2} = \frac{\sqrt{35}}{9}. \]
Thus, the equation of the capital market line is
\[ \mu = \mu_{rf} + \frac{\mu_{mt} - \mu_{rf}}{\sigma_{mt}}\,\sigma = \frac{1}{2} + \frac{\sqrt{35}}{2}\,\sigma. \]
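The market portfolio of Theorem 17.5.2 and the slope of the capital market line can be computed along the same lines as before. The sketch below is written in plain Python with a hand-written elimination routine; the function names are our own, and the data are those of Example 17.5.1 (returns and variances expressed in percent, as in the example).

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(x) for x in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def market_portfolio(C, m, mu_rf):
    """Return (w_mt, mu_mt, sigma_mt, slope of the capital market line)."""
    excess = [mi - mu_rf for mi in m]          # m - mu_rf * e
    v = solve(C, excess)                        # C^{-1}(m - mu_rf * e)
    total = sum(v)
    if abs(total) < 1e-12:
        raise ValueError("e^T C^{-1}(m - mu_rf e) vanishes; no tangent portfolio")
    w = [x / total for x in v]
    mu_mt = dot(m, w)
    sigma_mt = math.sqrt(dot(w, [dot(row, w) for row in C]))
    slope = (mu_mt - mu_rf) / sigma_mt
    return w, mu_mt, sigma_mt, slope


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w_mt, mu_mt, sigma_mt, slope = market_portfolio(identity, [1.0, 2.0, 3.0], 0.5)
```

The capital market line is then $\mu = \mu_{rf} + \text{slope} \cdot \sigma$, with `slope` playing the role of the coefficient of $\sigma$ in the example.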


Remark 17.5.1 Suppose the market portfolio $(\sigma_{mt}, \mu_{mt})$ is known. Then, from (17.20), the equation of the capital market line is given by
\[ \mu = \mu_{rf} + \frac{\mu_{mt} - \mu_{rf}}{\sigma_{mt}}\,\sigma. \]
If the investor is willing to take a positive risk $\sigma$, he can earn an additional return $\left(\frac{\mu_{mt} - \mu_{rf}}{\sigma_{mt}}\right)\sigma$ over and above the risk-free return $\mu_{rf}$ to compensate the risk taken by him.

In practice there are certain assets listed on the stock exchange which are called index stocks. These limited assets are significant ones that can capture the pulse of the whole market. The most regularly quoted market indices are broad-base indices comprising the stocks of large companies listed on a nation's largest stock exchanges, such as the American Dow Jones Industrial Average and S&P 500 Index, the British FTSE 100, the French CAC 40, and the Japanese Nikkei 225. The Bombay Stock Exchange is the largest in India, with over 6000 stocks listed, and it accounts for over two thirds of the total trading volume in the country. The index stocks finally help us to compute the market portfolio $(\sigma_{mt}, \mu_{mt})$. The knowledge of the market portfolio yields the equation of the capital market line, see Remark 17.5.1. Now suppose an investor $P$ is willing to take risk $\sigma_P$. Then for this risk, the expected return $\mu_P$ is maximum if the point $(\sigma_P, \mu_P)$ lies on the capital market line. Thus,
\[ \mu_P = \mu_{rf} + \frac{\mu_{mt} - \mu_{rf}}{\sigma_{mt}}\,\sigma_P. \]
If we let $w_P = \sigma_P/\sigma_{mt}$, then
\[ \mu_P = w_P\mu_{mt} + (1 - w_P)\mu_{rf}. \]

Remark 17.5.2 The expected return on an efficient portfolio can be thought of as
(Expected return) = (Price of time) + (Price of risk) × (Amount of risk).
The above relation suggests that an investor willing to take risk $\sigma_P$ should put the proportion $w_P = \sigma_P/\sigma_{mt}$ of the investment in an index fund and the remaining proportion $1 - w_P$ in risk-free investment schemes.


We now aim to examine how an individual asset behaves with respect to the market portfolio. For this, we attempt to build a relationship between the expected return, along with the risk, of an individual asset and the market portfolio.

Theorem 17.5.3 Suppose the market portfolio is $(\sigma_{mt}, \mu_{mt})$. The expected return of an asset $a_i$ is given by
\[ \mu_i = \mu_{rf} + \beta_i(\mu_{mt} - \mu_{rf}), \quad \text{where} \quad \beta_i = \frac{\mathrm{cov}(r_i, r_{mt})}{\sigma_{mt}^2}. \tag{17.24} \]
Proof. Suppose an investor portfolio comprises asset $a_i$ with weight $w$ and the market portfolio $M$ with weight $1 - w$. Then the expected return and risk of the investor portfolio are respectively given by
\[ \mu = w\mu_i + (1 - w)\mu_{mt}, \]
\[ \sigma^2 = w^2\sigma_i^2 + (1 - w)^2\sigma_{mt}^2 + 2\rho w(1 - w)\sigma_i\sigma_{mt}, \tag{17.25} \]
where $\rho$ is the coefficient of correlation between the asset $a_i$ and the market portfolio.

As $w$ varies, these values trace out a curve in the $(\sigma, \mu)$-graph. It can be observed from Fig 17.10 that as $w$ passes through zero, the capital market line becomes tangent to the curve at $M$. This tangency condition can be translated into the condition that the slope of the curve is equal to the slope of the capital market line at $M$ (corresponding to $w = 0$).

[Figure: the capital market line through $(0, \mu_{rf})$, tangent at $(\sigma_{mt}, \mu_{mt})$ to the curve passing through $(\sigma_i, \mu_i)$]

Fig. 17.10.

Now the slope of the curve at $M$ is given by
\[ \frac{d\mu}{d\sigma}(w = 0) = \frac{\frac{d\mu}{dw}(w = 0)}{\frac{d\sigma}{dw}(w = 0)} = \frac{\mu_i - \mu_{mt}}{\frac{d\sigma}{dw}(w = 0)}. \]
Differentiating (17.25) with respect to $w$ and computing its value at $w = 0$, we get
\[ \frac{d\sigma}{dw}(w = 0) = \left.\frac{w\sigma_i^2 - (1 - w)\sigma_{mt}^2 + \rho\sigma_i\sigma_{mt}(1 - 2w)}{\sigma}\right|_{w = 0} = \frac{\sigma_{imt} - \sigma_{mt}^2}{\sigma_{mt}}, \quad \sigma_{imt} = \rho\sigma_i\sigma_{mt}. \]
Consequently,
\[ \frac{d\mu}{d\sigma}(w = 0) = \frac{(\mu_i - \mu_{mt})\sigma_{mt}}{\sigma_{imt} - \sigma_{mt}^2}. \tag{17.26} \]

As discussed above, the slope of the curve needs to be equal to the slope of the capital market line at $M$, thereby yielding
\[ \frac{\mu_{mt} - \mu_{rf}}{\sigma_{mt}} = \frac{d\mu}{d\sigma}(w = 0). \]
The above relation along with (17.26), on simplification, yields
\[ \mu_i = \mu_{rf} + \frac{\sigma_{imt}}{\sigma_{mt}^2}(\mu_{mt} - \mu_{rf}) = \mu_{rf} + \beta_i(\mu_{mt} - \mu_{rf}). \]
□
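Relation (17.24) can be checked numerically on a small market. The sketch below assumes three mutually independent assets with unit variances, expected returns 1, 2, 3 and risk-free return 0.5 (all in percent), for which the market portfolio weights are $(1/9, 3/9, 5/9)$. The function names are illustrative, and the covariance of asset $i$ with the market is computed as the $i$-th entry of $Cw_{mt}$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def asset_betas(C, w_mt):
    """beta_i = cov(r_i, r_mt) / sigma_mt^2, with cov(r_i, r_mt) = (C w_mt)_i."""
    cov_with_market = [dot(row, w_mt) for row in C]
    market_variance = dot(w_mt, cov_with_market)
    return [c / market_variance for c in cov_with_market]


def capm_expected_return(mu_rf, mu_mt, beta):
    """Expected return predicted by relation (17.24)."""
    return mu_rf + beta * (mu_mt - mu_rf)


def portfolio_beta(weights, betas):
    """Weighted average of the individual asset betas."""
    return dot(weights, betas)


# Assumed market data (in percent).
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
m = [1.0, 2.0, 3.0]
mu_rf = 0.5
w_mt = [1.0 / 9.0, 3.0 / 9.0, 5.0 / 9.0]

mu_mt = dot(m, w_mt)
betas = asset_betas(C, w_mt)
predicted = [capm_expected_return(mu_rf, mu_mt, b) for b in betas]
beta_of_market = portfolio_beta(w_mt, betas)
```

The predicted returns coincide with the assumed expected returns of the three assets, and the beta of the market portfolio itself comes out as 1.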


Remark 17.5.3 Here, $\beta_i = \sigma_{imt}/\sigma_{mt}^2$ is called the Beta of an asset. Note that, for the market portfolio, $\beta_{mt} = 1$. Beta is generally calculated for individual assets using regression analysis. As can be observed, beta measures an asset's volatility or risk in relation to the rest of the market. It is thus appropriately referred to as financial elasticity or correlated relative volatility, and it is all that is required to be known about the asset's risk characteristics in the CAPM formula. In other words, an investor ready to bear some systematic risk gets rewarded for it. For instance, if $\beta_i = 2$, it indicates that the return of the $i$-th asset is expected to increase (decrease) by 2% when the market increases (decreases) by 1%. Equivalently, if the market return fluctuates over a specific range of values, the asset returns will fluctuate over a larger range of values. Thus, the market risk is magnified in the asset risk.


Definition 17.5.3 (Beta of the Portfolio). The overall Beta $\beta$ of the portfolio is the weighted average of the Betas of the individual assets in the portfolio, with the weights being those that define the portfolio, i.e.
\[ \beta = \sum_{i=1}^{n} w_i\beta_i. \]

Definition 17.5.4 (Security Market Line). A linear equation that describes the expected return for all assets in the market is called the security market line. All investments and portfolios must lie along this line in a beta-return space. The line is depicted pictorially in the accompanying figure.

17.6 Summary and Additional Notes 


e In this chapter we analyzed the advantages of diversifying the total investment among 

several assets optimally so as to get a ‘decent’ return with minimum risk. The theory 
described here is mainly based on the work of Markowitz[113]. 

e The requisite terminologies were introduced in Section 2, followed by a simple case 

of two assets portfolio optimization in Section 3, to get the feel of the subject. The 

ideas of Section 3 lead us to extend the analysis to the multi asset scenario in Section 


4, 


e Section 5 continues with the multi asset portfolio optimization with the difference 


that one risk-free asset 1s included in the portfolio. This inclusion results in the 


Capital Asset Pricing model (CAPM) and a new concept of market portfolio. It was 

shown that the market portfolio can guide the investor to determine the advantage 

: of taking more risk with his investment. 

• The contents of the chapter are kept very simple with the intention to familiarize
the readers with the basics behind portfolio optimization. However, things can be
very complex in financial circuits. One needs to realize that any study related to
... and high level of understanding. Several in-
... aspects of portfolio theory can be referred to, like,
...iak [31], Cornuejols and Tütüncü
... [133], to name a few.

• Another area where portfolio optimization has gained momentum in recent years
is the asset-liability management of the institutional investors, like insurance
companies, pension funds or mutual funds. The institutional investors make huge
investments and simultaneously repay the maturity amounts to the other
investors who had invested with them. For this they need to constantly
rebalance their portfolio after every time frame, however small. An
institutional investor will get some inflow of money at time t as the return
from various investments that had been made in the past and which had
subsequently matured at time t, and also the institutional investor needs to pay the
maturity amount to all those investors who had invested with it and whose funds
have matured at the end of time t-1. The remaining amount is reinvested in the
market. The time scale involved in such asset-liability problems has been captured by
using stochastic linear programming models (see [88] for stochastic LPP). A number
of research papers can be found on asset-liability management, like [146, 157, 171]
and references therein.
• There are several commercial packages, for instance, CPLEX, LINGO, MATLAB,
SAS, that provide a lot of inbuilt functions for portfolio analysis. The major
disappointment with all the commercial packages is that they can at best generate only
the approximate piecewise linear representation of the efficient frontier in portfolio
optimization. With a large number of assets involved, say 600-800, the performance
of these packages in computing the efficient frontier deteriorates. The MPQ
(multi-parametric quadratic programming), programmed in Java and available in the public
domain, performs exceptionally well on large-scale applications in a reasonable time
and yields the exact efficient frontier. For more on MPQ, we refer to Steuer et al.
[149].
• The effect of introduction of transaction costs and/or different lending and borrowing
rates in portfolio optimization theory has also been analyzed in literature. Some of
the books mentioned above contain subject matter on this issue.
• Several problems of mathematical finance have been modeled as optimization
problems in literature. Among them, one of the most widely studied problems is the
pricing of derivative securities, and in particular, financial options. The fundamental
theorem of asset pricing showing the existence of risk neutral probability has
been nicely proved using LPP duality in [39]. Besides the references listed above, one
can also look for the books Ammann [3], Brigo and Mercurio [28] and David [45]
for more applications of optimization problems in credit-risk models, interest rate
models, volatility estimation and other financial problems.
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17.7 Exercises 


17.1 Suppose there are three financial market scenarios $\Omega = \{\omega_1, \omega_2, \omega_3\}$ with different
probabilities of occurrence. Consider the following table showing the returns on two
different stocks in these three scenarios.

scenario      prob      return $k_1$ %      return $k_2$ %
$\omega_1$          0.2       -10               -30
$\omega_2$          0.5         0                20
$\omega_3$          0.3        20                15

(a) What are the expected returns on the stocks?

(b) Suppose 60% of the available fund is invested in stock 1 and the remaining is invested
in stock 2, then what is the expected return of the portfolio?

(c) Compute the weights if the expected return on a portfolio is 20%.


17.2 Consider the following data 


















nt in stock 1 and 60% in stock 2. 
mponents. What will 
in stock 1 and the 
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constructed using three securities ay, az, ay y 
expected returns $\mu_1$ = ..., $\mu_2$ = 18%, $\mu_3$ = 17%, standard deviations of returns $\sigma_1$ =
25%, $\sigma_2$ = 28%, $\sigma_3$ = 20%, and the correlation between returns, $\rho_{12}$ = 0.3, ... = 0.15, ...
Among all the attainable portfolios, find the one with minimum variance. What are the
weights of the three securities in this portfolio? Also compute the expected return and
standard deviation of this portfolio.

17.6 Among all attainable portfolios with expected return 20% constructed using the
data provided in exercise 17.5, find the portfolio with minimum variance. Compute the
weights of individual assets in this portfolio.


17.7 Consider the following data

              $\mu$       $\sigma$
asset 1      10%      5%
asset 2       8%      2%

For each correlation coefficient $\rho$ = -1, -0.5, 0, 0.5, 1, what is the combination of the
two assets that yields the minimum standard deviation and what is the minimum value
of the standard deviation?


17.8 Compute the minimum risk portfolio for the following rate of return (%) data:

            Jan    Feb    Mar    Apr    May    June
asset 1     12     10      5     yA     15     12
asset 2      7     12     10     10     12     15


Also compute the expected return for the optimal portfolio. 


17.9 Consider three risky assets with the covariance matrix and expected returns (all 
data in %) as follows. 


variance-covariance matrix (C)      return (M)

    10     4     0                      5
     4    12     6                      6
     0     6    10                      1










Find two portfolios yielding the minimum variance. Also, determine the expected return


— = fron th vi 
i hile portfolios. Using the two fund theorem, construct the portfolio gwg the 
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Ni = 20 4 

= C=/ 20 ° 10 70 

S, 40 70 14 

what is the optimum portfolio for the investor? What is the expected return of this
portfolio?

17.11 Consider the data of two risky assets $a_1$, $a_2$ with $\mu_1$ = 12.5%, $\mu_2$ = 10.5%, $\sigma_1$ =
14.9%, $\sigma_2$ = 14%, $\rho$ = 0.33.

(a) Is it advisable to diversify the investment? If so, then what composition of the assets
will minimize the risk?

(b) What is the minimum value of the risk?

(c) If the risk-free rate of return is 5%, then derive the equation of the capital market
line.


17.12 Given the following information about the one risk-free asset and three risky
assets, find the expected return and standard deviation of the market portfolio. Also
determine the equation of the capital market line.

$\mu_f$ = 5%, $\mu_1$ = 14%, $\mu_2$ = 8%, $\mu_3$ = 20%;
$\sigma_1$ = 6%, $\sigma_2$ = 3%, $\sigma_3$ = 15%; $\rho_{12}$ = 0.5, $\rho_{13}$ = 0.2, $\rho_{23}$ = 0.4.
17.13 Assume that the following assets are correctly priced according to the security
market line. Derive the security market line. What is the expected return on an asset
with $\beta$ = 2?

$\mu_1$ = 6%, $\beta_1$ = 0.9;

$\mu_2$ = 12%, $\beta_2$ = 1.6.

17.14 If the following two assets are correctly priced according to the security market
line, what is the return of the market portfolio? What is the risk-free return?

$\mu_1$ = 9.5%, $\beta_1$ = 0.8;

$\mu_2$ = 13.5%, $\beta_2$ = 1.3.
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18 


Mathematical Programming Applications in 
Engineering 


18.1 Introduction 


Optimization is ubiquitous, with a multitude of applications in diverse areas. Some 
examples have already been listed during the course of our discussion in almost all the 
chapters. Besides traditional areas such as linear programming, integer programming, 
and mixed integer programming, some interesting areas where optimization tools have 
been fruitfully applied, constituted the subject matter of previous chapters on semi- 
definite programming, machine learning and financial mathematics. 

It is always beneficial to complement the theoretical developments of the subject
with some examples from real-life applications. This chapter is offered as a modest
contribution in this direction. Our aim herein is to mention examples from engineering
and life sciences that can be modeled as optimization problems. These problems can
then be solved by theoretical tools developed in the earlier chapters.

Going with the flavor of the book, we have kept the discussion very simple by
restricting ourselves to present optimization models of the problems, rather than going
on to discuss their solution methodologies. In some sense this chapter can be viewed as
an icing on a cake with the purpose of attracting prospective users to appreciate the
power of the subject in solving interdisciplinary problems.


18.2 Optimization in VLSI Design 


From its humble beginning in the early 50's to the manufacture of circuits with millions
of components today, VLSI (very large scale integration) design has brought the power
of the main frame computer to the laptop. The tremendous growth in the area of VLSI
design has been made possible by the development of sophisticated tools and software.
To deal with the complexity of millions of components, VLSI design tools must be
computationally fast and must yield near optimal designs.

There are several steps involved in VLSI design, like system specification (based on
factors such as speed, power, choice of fabrication technique etc.), functional
design (depicting relations between various subunits), logic design, circuit design, circuit
layout, design verification, fabrication (involving preparation of wafers, deposition and
diffusion of various materials on the wafers), testing and debugging. Among these steps,
the circuit layout or geometric representation of the design is one of the most challenging
and complex processes. The circuit layout task is generally performed in two steps, a
placement phase and a routing phase. In the placement phase, circuit cells are assigned
to locations on a layout grid. In the routing phase, interconnections between the cells
are realized. We confine our discussion to the placement phase for illustrating the role
of optimization in VLSI design.

The placement task seeks to assign all cells of the circuit to appropriate locations
so that cells do not overlap with each other. Each cell $i$ is assigned a location $(x_i, y_i)$ in
the xy-plane. The cost of the wire connecting two cells $i$ and $j$ is proportional to the
squared Euclidean distance between the two cells with $w_{ij}$ being the weight. This problem
leads to the following quadratic optimization formulation.

$$\min\ \phi(x, y) = \sum_{1 \le i < j \le N} w_{ij}\left[(x_i - x_j)^2 + (y_i - y_j)^2\right]$$

subject to

$$l_x \le x_i \le u_x, \quad l_y \le y_i \le u_y, \quad \forall i, \qquad (18.1)$$

where N is the number of movable cells; (x, y) are the coordinates of the movable cells;
$(l_x, l_y)$ and $(u_x, u_y)$ are bounds on the location of the block. We note that the objective
function and the constraints in (18.1) are separable in variables $x_i$ and $y_i$, i.e. we can
solve two independent QPPs in x and y variables.
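The separability noted above can be verified numerically. The sketch below assumes the squared-distance form of the objective and uses the standard identity that a weighted sum of squared differences equals a quadratic form in the weighted graph Laplacian; the helper names and the three-cell data are hypothetical.

```python
def wirelength(w, x, y):
    """Weighted sum of squared distances over all cell pairs i < j."""
    n = len(x)
    return sum(
        w[i][j] * ((x[i] - x[j]) ** 2 + (y[i] - y[j]) ** 2)
        for i in range(n)
        for j in range(i + 1, n)
    )


def laplacian(w):
    """Weighted graph Laplacian: row sums on the diagonal, -w_ij elsewhere.

    Assumes w is symmetric with a zero diagonal.
    """
    n = len(w)
    return [[sum(w[i]) if i == j else -w[i][j] for j in range(n)] for i in range(n)]


def quad_form(a, v):
    """Compute v^T a v for a square matrix a."""
    n = len(v)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))


# Hypothetical connection weights and coordinates for three cells.
w = [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
x = [0.0, 1.0, 3.0]
y = [2.0, 0.0, 1.0]

lap = laplacian(w)
total = wirelength(w, x, y)
split = quad_form(lap, x) + quad_form(lap, y)  # one QPP in x plus one in y
print(total, split)
```

Because the two quadratic forms share no variables, each can be minimized on its own subject to its own bounds.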

The optimization problem (18.1) does not ensure non-overlapping placement of cells. 
Usually, a set of slots is defined, with each cell being required to be assigned to a 
separate slot. This requirement may be enforced by introducing additional constraints 
into the above formulation. Since the problems in the two sets of x and y variables are 


independent, we indicate the constraints for the problem involving only x, since the 
other problem can be formulated similarly. 


$$\sum_{i=1}^{N} x_i = \sum_{i=1}^{N} \mathrm{xslot}_i. \qquad (18.2)$$











This constraint is called a linear slot constraint. The constant $\mathrm{xslot}_i$ indicates the
x-coordinate of the i-th slot. The linear slot constraint helps to distribute the cells more
uniformly. However, this constraint alone is not sufficient to ensure a non-overlapping
placement. In order to ensure that all cells are located in separate slots, we need
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P (18.3) 


However, the addition of the above described higher order slot constraints in prob- 
lem (18.1) no longer guarantees convexity of the problem. Thus, the addition of these 
constraints would complicate the scenario considerably, thereby making the problem 
formulation no longer amenable to simple and efficient optimization techniques. 

Instead, what is done is to first solve the placement problem (18.1) along with
constraint (18.2). In the solution obtained, let the co-ordinates of the blocks be given by
$x_{0i}$. We now solve the problem

$$\min\ \sum_{i=1}^{N} (x_i - x_{0i})^2$$

subject to

$$\sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} \mathrm{xslot}_i^2.$$

This is done for different subsets of blocks, while blocks outside a chosen set of blocks are
kept fixed. The idea is to repeatedly solve small QPPs and arrive at an approximate
solution to the original problem.
Simpler versions of placement illustrate other ways in which optimization methods
play an integral role in VLSI CAD. Once again, we consider the following formulation
of the placement task.

$$\min\ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} c_{ij}\left[(x_i - x_j)^2 + (y_i - y_j)^2\right]$$

which may be written more compactly as

$$\min\ X^T B X + Y^T B Y$$

subject to

$$X^T X = 1, \quad Y^T Y = 1.$$

The readers are encouraged to write the KKT conditions for the above QPP and
show that an optimal solution must satisfy

$$BX = \lambda X, \quad BY = \mu Y,$$

implying that X and Y are the eigenvectors corresponding to the non-trivial eigenvalues
of matrix B. The reader might like to think this one out: of all the eigenvalues, which
ones should we choose?
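A small numerical experiment may help with the question just posed. The sketch below assumes, purely for illustration, that B is the Laplacian of a chain of three cells connected by unit weights; for each unit eigenvector it checks the eigenvalue equation and evaluates the quadratic cost.

```python
import math


def matvec(B, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(b * x for b, x in zip(row, v)) for row in B]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Assumed example: Laplacian of a 3-cell chain with unit connection weights.
B = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]

# Unit-norm eigenvectors of B, keyed by their eigenvalue.
candidates = {
    0.0: [1 / math.sqrt(3)] * 3,
    1.0: [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)],
    3.0: [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)],
}

results = {}
for lam, X in candidates.items():
    BX = matvec(B, X)
    residual = max(abs(bx - lam * x) for bx, x in zip(BX, X))  # checks BX = lam X
    cost = dot(X, BX)                                          # X^T B X
    results[lam] = (residual, cost)
    print(lam, residual, cost)
```

For a unit eigenvector the cost equals its eigenvalue. The eigenvalue 0 gives zero cost but places every cell at the same coordinate, which is why it is regarded as trivial.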





18.3 Global Routing for VLSI Standard Cells 


In Section 18.2, we discussed the placement problem in which layout objects are placed
or located based on the interconnections between them. Placing the connected objects
closer to each other helps in the next phase of the design flow, routing, wherein the
goal is to place wires to physically realize connections. The interconnects have an impact
on the performance, power consumption, and the area; in fact, in current technologies,
they constitute the most significant component.
Routing is usually accomplished in two phases: a global routing phase, which
determines the regions through which connections should run, and a detailed routing phase.
By determining which routing regions a connection runs through, global routing helps
partition the detailed routing problem into smaller sub-problems that are more amenable
to an efficient solution. In discussing the formulation of the problem, we follow the work
of Areibi [5].
A set of pins to be connected is termed as a net. Let $N_i$ denote the number
of possible trees for net $i$, and let the set of possible trees for
net $i$ be $T_{i1}, \ldots, T_{iN_i}$. These trees can be used for routing the pins
of the net. A set of variables $V_{ij}$ is used to indicate which tree is chosen for routing, i.e.

$$V_{ij} = \begin{cases} 1, & \text{if } T_{ij} \text{ is chosen for routing net } i \\ 0, & \text{otherwise.} \end{cases}$$

The global routing problem is formulated as a 0-1 LPP as follows.

$$\max\ \sum_{i}\sum_{j=1}^{N_i} a_{ij} V_{ij}$$
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(18-4) 
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p f, and it is given by 


$$a_{ij} = (\text{Maximum wire length} + 1) - (\text{Wire length of tree } T_{ij}).$$

The first constraint in (18.4) ensures that only one tree is chosen for each net, while the
second constraint in (18.4) ensures that the capacity limit is not exhausted.

Such a problem is difficult to solve and the cost, in general, is prohibitive for large
problems. An alternative is to resort to a relaxed version of the above problem by adding
the following additional constraints and removing the requirement that $V_{ij}$
be 0 or 1:

$$0 \le V_{ij} \le 1 \quad \forall\, i, j. \qquad (18.5)$$

Note that this leads to a linear programming problem, called the 'relaxed ILP'.

An interesting approach to solve problem (18.4) along with the constraints (18.5) is
given by Raghavan [134]. The steps in Raghavan's approach are as follows.

Step 1. Solve the relaxed ILP. Let the optimal solution be given by $V = V^*$.

Step 2. Choose value for $V_{ij}$ by generating random numbers {0, 1} using a biased random
number generator that generates 0's with a probability of $(1 - V^*_{ij})$ and 1's with a
probability of $V^*_{ij}$.

Step 3. Repeat Step 2 until the termination criterion is met, and then go to Step 4.
The termination criterion is usually based on the number of iterations of Step 2, or the
improvement in the solution cost.

Step 4. Output the best feasible solution.

The logic of Step 2 stems from 'Bernoulli trials'. The expected value of $V_{ij}$ is $V^*_{ij}$. The
expected value of the objective function after several trials is given by
$\sum_{i}\sum_{j=1}^{N_i} a_{ij} V^*_{ij}$. Thus, the expected cost of the randomized solution obtained
using Raghavan's procedure is the optimal cost of the relaxed version of the problem.
The described procedure can yield substantial speedups over ILP solvers, since it
efficiently uses a LPP solution to obtain the solution to a much harder problem.






18.4 Wire-Sizing and Buffer Sizing 


interconnect delay is increasingly becoming ... Optimizing the sizes of buffers and their
insertion locations has long been known to be effective in reducing delay. However, it is
well known that wire sizing is also an effective approach. We follow the discussion in
Chu and Wong [37], and consider wire-sizing and buffer sizing problems.

Each segment of interconnect between two buffers is modeled by using a lumped RC
model, as shown in Fig 18.1.








[Figure: a wire segment modeled as a resistance with a capacitance at each end]

Fig. 18.1.


The capacitance of a wire segment of width $w$ and length $l$ is given by $c(w)l$, where
$c(w)$ is a monotonically increasing function that denotes the wire capacitance for a
segment of width $w$ and unit length.

We first consider the simple wire sizing problem. In this case, a driver (or buffer)
with driver resistance $R_d$ is driving a load capacitance of value $C_L$ through a set of M
interconnect segments of lengths $l_1, \ldots, l_M$. The task is to determine the wire length and
width of each segment. The total interconnect length is specified, say, L. The wire sizing
problem involves minimizing the following objective function


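$$R_d\left(\sum_{i=1}^{M} c_i l_i + C_L\right) + r_1 l_1\left(\frac{c_1 l_1}{2} + \sum_{i=2}^{M} c_i l_i + C_L\right) + r_2 l_2\left(\frac{c_2 l_2}{2} + \sum_{i=3}^{M} c_i l_i + C_L\right) + \cdots + r_M l_M\left(\frac{c_M l_M}{2} + C_L\right)$$

$$= \frac{1}{2} L^T A L + b^T L + R_d C_L,$$

where $L = (l_1, \ldots, l_M)^T$, $b = (b_1, \ldots, b_M)^T$, $b_i = R_d c_i + r_i C_L$ $(i = 1, \ldots, M)$, and $A$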
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1 
$$\min\ \frac{1}{2} L^T A L + b^T L + R_d C_L$$

subject to

$$\sum_{i=1}^{M} l_i = L, \qquad (18.6)$$

$$l_i \ge 0 \quad (i = 1, \ldots, M).$$

It can be shown that for the wire-sizing problem, the matrix A is positive definite,
hence the QPP (18.6) is a convex programming problem.

In order to reduce the delay along a long interconnect, it is usually divided into
segments and buffers are inserted after each segment. The sizes of the buffers $B_1, \ldots, B_N$
need to be chosen for an optimal delay, since the last buffer drives the load capacitance
(usually large), while the first one drives only the capacitance of an interconnect
segment. In general, optimization strategies yield results in which buffer sizes increase
from the first to the last, with each buffer driving a progressively greater load.

Resistance Rg; 
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N N 
Min TDL + aL ae > RbkCBp + yA TB, 
2 k=0 k=0 
subject to 
M(N+1) 
li =L 


18.5 Synthesis of Antennae Array 
An important engineering application of optimization problems is the synthesis of arrays
of antennae.

An antenna is an electromagnetic device that can generate and receive electromagnetic
waves. Antennas convert electromagnetic waves into electrical currents and vice
versa. They are used in systems such as radio and television broadcasting, point-to- 
point radio communication, wireless LAN, radar, and space exploration. Antennas can 
operate in air or outer space, under water or even through soil and rock at certain 
frequencies for short distances. Physically, an antenna is an arrangement of conductors 
that generate a radiating electromagnetic field in response to an applied alternating 
voltage and the associated alternating electric current. 

Methods of synthesis and optimization are now very common to many electromag- 
netic and antenna problems. Here we describe a nonlinear optimization formulation of 
the design of the sonar transducer arrays. The presentation is taken from the research 
work of Lasdon et al. [102]. 

A sonar transducer or sensor is a device which converts the energy of the sound wave 
traveling in water (an acoustic signal) into an electric signal. 

Consider a sensor whose position is specified by the vector $r = (x, y, z)$. Acoustic
plane waves of wavelength $\lambda$ and frequency $f$, incident in a direction specified by the
unit vector $u = (\alpha, \beta, \gamma)$, impinge upon this sensor. Using spherical coordinates, we have

$$\cos\alpha = \sin\phi\cos\theta, \quad \cos\beta = \sin\phi\sin\theta, \quad \cos\gamma = \cos\phi.$$

Another quantity, called the 'wave number', is defined as $k = \left(\frac{2\pi}{\lambda}\right)u$. If the incoming signals
are assumed to have unit amplitude and zero phase at the origin, then the sensor output
$o_s$ is described by a complex number as follows.

$$o_s = B(k)\exp(i k^T r),$$
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W r 
the plane wave reaches the sensor ... seconds before it reaches the origin. Associated
with each sensor are two adjustable parameters - a complex shading coefficient $w$
and a steering phase $r^T k_s$. After these operations are applied, the sensor response $r_s$ is
given by

$$r_s = w B(k)\exp(i r^T (k - k_s)).$$


Let there be a linear array of N acoustic antennae placed at fixed positions $r_j$ $(j =
1, \ldots, N)$. The array response, called the 'beam pattern', is obtained by summing over
all sensors, i.e.

$$a(k, k_s, w) = \sum_{j=1}^{N} w_j B_j(k)\exp(i r_j^T (k - k_s)).$$
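The beam pattern sum is easy to evaluate directly. The sketch below assumes ideal sensors with $B_j(k) = 1$, a four-element line array with half-wavelength spacing along the x-axis, and unit shading coefficients; these choices and the function names are illustrative assumptions only.

```python
import cmath
import math


def wave_vector(wavelength, u):
    """k = (2*pi/wavelength) * u for a direction vector u."""
    return tuple(2 * math.pi / wavelength * comp for comp in u)


def beam_pattern(positions, weights, k, k_s, gains=None):
    """Sum of w_j * B_j * exp(i * r_j . (k - k_s)) over all sensors."""
    if gains is None:
        gains = [1.0] * len(positions)  # assumed ideal sensors, B_j(k) = 1
    total = 0j
    for r, w, b in zip(positions, weights, gains):
        phase = sum(rc * (kc - ksc) for rc, kc, ksc in zip(r, k, k_s))
        total += w * b * cmath.exp(1j * phase)
    return total


wavelength = 1.0
positions = [(0.5 * j * wavelength, 0.0, 0.0) for j in range(4)]
weights = [1.0] * 4

k_s = wave_vector(wavelength, (0.0, 0.0, 1.0))    # steer broadside
k_main = k_s                                      # look in the steering direction
k_end = wave_vector(wavelength, (1.0, 0.0, 0.0))  # look along the array axis

a_main = beam_pattern(positions, weights, k_main, k_s)
a_end = beam_pattern(positions, weights, k_end, k_s)
print(abs(a_main), abs(a_end))
```

In the steering direction all terms add in phase, so the response equals the sum of the shading coefficients; along the array axis the half-wavelength spacing makes successive terms cancel.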
Suppose we wish to minimize the beam pattern level over a given zone with the
possibility of level constraints in other areas. The problem can be formulated as the
following optimization problem.

$$\min_{w}\ \max_{1 \le m \le M} |a_m(w)|$$
subject to 


aw) = 1, (18.7) 


where M is the number of side lobe regions (whose levels should be as small as possible)
at which the beam pattern is to be evaluated and

$$a_m(w) = a(k_m, k_s, w), \quad k_m = \left(\frac{2\pi}{\lambda}\right)u_m, \quad u_m = (\alpha_m, \beta_m, \gamma_m).$$

$u_m$ specifies the direction cosines of the m-th point $(m = 1, \ldots, M)$. For optimization
purpose the steering direction $k_s$ is fixed.

The complex functions and complex variables appearing in problem (18.7), when
expressed in terms of the real functions and variables, lead to the following nonlinear
constrained optimization problem.

(18.8)
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18.6 Chemical Equilibrium 


We begin with a simple example pertaining to chemical equilibrium. The example is
taken from White et al. [164]. Chemical equilibrium is the state in which the chemical
activities or concentrations of the reactants and products experience no net change over
a period of time. Usually, this would be the state that results when the forward chemical
process proceeds at the same rate as its reverse reaction.

The chemical equilibrium problem requires us to determine the chemical composition 
of a complex mixture, containing m different types of chemical elements, at chemical 
equilibrium. The governing principle behind chemical equilibrium is the ‘second law 
of thermodynamics’. The thermodynamic condition for chemical equilibrium is, that 
a chemical mixture held at a constant temperature and constant pressure reaches its 
chemical equilibrium state when the Gibbs free energy of the mixture is minimum. The 
problem therefore involves minimizing the Gibbs free energy of the mixture subject to
chemical reactions possible between the chemical elements of the mixture.

Before describing the mathematical model of the problem, we explain the notation
that will be used in the problem formulation. Let

m be the total number of chemical elements in the mixture;

n be the total number of chemical compounds formed after the chemical reactions;

$x_j$ be the number of moles of compound j present in the mixture at equilibrium;

$a_{ij}$ be the number of atoms of element i in a molecule of compound j;

$b_i$ be the number of atomic weights of element i in the mixture;


n 
1 


= 
P be the total atmospheric pressure, R be the gas constant, T be the absolute
temperature, and F be the Faraday constant.
White et al. [164] formulated the chemical equilibrium problem as follows.
n 


Min i: ar X vIn (Ž) 
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18:7 protein Folding 
We now cite an example from biological sciences that is of great interest to the
optimization community. One of the most important and difficult problems in computational
biology and biochemistry is the protein folding problem. The protein folding problem is:
Given a protein with a known sequence of amino acids, predict its native or folded state
in 3-dimensional space, i.e. predict how newly made proteins, which resemble loosely
coiled strands and are inactive in their unfolded configurations, will fold into ...
shaped balls able to perform crucial tasks in a living cell.
Let the molecules to be folded consist of a linear sequence of n beads $a_1, \ldots, a_n$,
where $a_i$ denotes the i-th bead in the primary sequence. Let $l_i$ be the length of the i-th
bond, i.e. the distance between two nuclei of atoms of consecutive beads $a_i$ and $a_{i+1}$. For
example, the equilibrium bond length of Carbon-Nitrogen (C-N) is 1.335 Å. For three
consecutive beads $a_{i-1}$, $a_i$, $a_{i+1}$, let $\theta_i$ represent the 'bond angle' corresponding to the
position of the third bead $a_{i+1}$ with respect to the line joining $a_{i-1}$ and $a_i$. For example,
the equilibrium bond angle of Carbon-Nitrogen-Hydrogen (C-N-H) is 118.8°, while for
Oxygen-Carbon-Nitrogen (O-C-N) it is 122.9°. A bond angle around 109° means that
the central atom is tetrahedral. For four consecutive beads, $a_{i-2}$, $a_{i-1}$, $a_i$, and $a_{i+1}$, let
$\phi_i$ denote the 'torsion angle' or 'dihedral angle' between the two planes described by
$a_{i-2}, a_{i-1}, a_i$ and $a_{i-1}, a_i, a_{i+1}$. This is illustrated in Fig 18.3.

Fig. 18.3.
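The three internal coordinates described above can be computed from Cartesian bead positions. The sketch below is illustrative: the bond angle is taken as the interior angle at the middle bead, and the torsion angle is the signed angle between the two planes, computed with a standard cross-product construction. These conventions, the function names and the bead coordinates are our assumptions.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_length(p, q):
    """Distance between two consecutive beads."""
    return norm(sub(q, p))


def bond_angle(p_prev, p_mid, p_next):
    """Interior angle (degrees) at the middle bead."""
    u, v = sub(p_prev, p_mid), sub(p_next, p_mid)
    cos_t = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) between planes (p0,p1,p2) and (p1,p2,p3)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    b2_unit = tuple(comp / norm(b2) for comp in b2)
    m1 = cross(n1, b2_unit)
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))


# Hypothetical positions of four consecutive beads.
beads = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
length = bond_length(beads[0], beads[1])
angle = bond_angle(beads[0], beads[1], beads[2])
torsion = torsion_angle(*beads)
print(length, angle, torsion)
```

Given all bond lengths, bond angles and torsion angles, the chain can be rebuilt up to a rigid motion, which is why these quantities serve as the variables of the folding problem.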


Let $x = (\ldots, \theta_{n-1}, \phi_{n-1}) \in \mathbb{R}^{3n-3}$ with $\theta_1 = 0$, $\phi_1 = 0$, $\phi_2 = 0$; and
let $f(x)$ be the potential energy function. Then, the problem of molecular
conformation involves determining the global min point of the problem
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proposed to solve it in two stages. We describe only the first step of the Proposed se 
in order to give readers a feel for the optimization problem; details may be found in [123].
In the first step, problem (18.9) is approximated by a discrete problem. The
3-dimensional space is approximated by a 3-dimensional lattice with N sites, N > n. Let
these sites be represented by $s_j$ $(j = 1, \ldots, N)$. Introduce variables $x_{ij}$ as follows,

$$x_{ij} = \begin{cases} 1, & \text{if } a_i \text{ is assigned to } s_j \\ 0, & \text{otherwise.} \end{cases}$$


Only two types of constraints are prescribed: (i) each bead must occupy exactly one
lattice site; (ii) at most one bead occupies each lattice site $s_j$.
The first stage problem is thus formulated as the following quadratic assignment
problem.

$$\min\ \sum_{i=1}^{n}\sum_{j=1}^{N} d_{ij} x_{ij} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{N}\sum_{k=1}^{n}\sum_{l=1}^{N} p_{ijkl}\, x_{ij} x_{kl}$$

subject to

$$\sum_{j=1}^{N} x_{ij} = 1 \quad (i = 1, \ldots, n),$$

$$\sum_{i=1}^{n} x_{ij} \le 1 \quad (j = 1, \ldots, N), \qquad (18.10)$$

$$x_{ij} \in \{0, 1\} \quad (i = 1, \ldots, n)\ (j = 1, \ldots, N).$$

The term $p_{ijkl}$ depends only on the two beads $a_i$, $a_k$, and the Euclidean distance
between their allocated sites $\|s_j - s_l\|_2$.

The objective function of problem (18.10) can be written as the total potential energy
in the quadratic form $f(x) = c^T x + \frac{1}{2}x^T Q x$, where $x \in \mathbb{R}^{nN}$ is a zero-one vector,
$c \in \mathbb{R}^{nN}$ with entries $d_{ij}$, and $Q$ is an $nN \times nN$ real symmetric matrix. The diagonal
elements of Q are zero.
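The quadratic form $c^T x + \frac{1}{2}x^T Q x$ can be assembled and evaluated for a toy instance. The sketch below is only illustrative: it uses a one-dimensional lattice of three sites instead of a 3-dimensional one, zero linear costs, and a made-up pair energy equal to the reciprocal of the distance. All names and values are hypothetical.

```python
def build_qap(d, pair_energy, sites):
    """Return (c, Q) for beads i = 0..n-1 and lattice sites j = 0..N-1.

    The variable x_ij is stored at position i*N + j of the 0-1 vector x.
    """
    n, N = len(d), len(sites)
    c = [d[i][j] for i in range(n) for j in range(N)]
    Q = [[0.0] * (n * N) for _ in range(n * N)]
    for i in range(n):
        for j in range(N):
            for k in range(n):
                for l in range(N):
                    if i != k and j != l:  # distinct beads on distinct sites
                        dist = abs(sites[j] - sites[l])
                        Q[i * N + j][k * N + l] = pair_energy(i, k, dist)
    return c, Q


def assignment_vector(assign, N):
    """assign[i] is the site index of bead i; returns the 0-1 vector x."""
    x = [0] * (len(assign) * N)
    for i, j in enumerate(assign):
        x[i * N + j] = 1
    return x


def is_feasible(x, n, N):
    """Each bead on exactly one site, each site holding at most one bead."""
    rows_ok = all(sum(x[i * N:(i + 1) * N]) == 1 for i in range(n))
    cols_ok = all(sum(x[i * N + j] for i in range(n)) <= 1 for j in range(N))
    return rows_ok and cols_ok


def energy(x, c, Q):
    """Evaluate c^T x + 0.5 * x^T Q x."""
    lin = sum(ci * xi for ci, xi in zip(c, x))
    quad = sum(x[a] * Q[a][b] * x[b] for a in range(len(x)) for b in range(len(x)))
    return lin + 0.5 * quad


sites = [0.0, 1.0, 2.0]                    # assumed 1-D lattice
d = [[0.0] * 3, [0.0] * 3]                 # assumed zero linear costs
c, Q = build_qap(d, lambda i, k, dist: 1.0 / dist, sites)

x_far = assignment_vector([0, 2], 3)       # beads two sites apart
x_near = assignment_vector([0, 1], 3)      # beads on adjacent sites
x_clash = assignment_vector([1, 1], 3)     # both beads on the same site
print(energy(x_far, c, Q), energy(x_near, c, Q), is_feasible(x_clash, 2, 3))
```

The factor one half compensates for the symmetric matrix counting every pair of beads twice.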

The appropriate choice of the potential energy function is a crucial factor in the
entire analysis. One such function of interest is the Lennard-Jones pairwise potential
between beads $a_i$ and $a_k$, and is defined as


O -2(*)) 


r 
ecific beads a; and ay. This function 
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is to minimize 
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tain problems it is desirable to weight some errors more heavily than the others. 
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This 1s ac 
to the problem

$$\min\ \sum_{i=1}^{p} w_i e_i^2(x) = m(x). \qquad (18.11)$$

It may happen that some of the errors are much larger than others. The sum of
squares of these errors will dominate the sum of squares of the smaller errors. The
solution of (18.11) will lead to a vector x that tries to make the larger errors small
but tends to ignore the smaller errors. In other words, some outlier entries in the
observed data are overemphasized. This ambiguity is removed by considering the
following optimization problem.

$$\min\ \sum_{i=1}^{p} w_i |e_i(x)|. \qquad (18.12)$$

Observe that problem (18.12) is not differentiable at points where $e_i(x) = 0$ for some i.
It can be converted into the following constrained problem.






622 Numerical Optimization with Applications 





$$\min\ \sum_{i=1}^{p} w_i(\xi_i + \eta_i)$$

subject to

There are other ways of considering the error. In some problems, it is important to focus
only on the largest errors while ignoring the others. In other words, for an arbitrary
vector x, determine the maximum absolute error, $\max_i |e_i(x)|$. Thereafter, find a vector
x that solves the following optimization problem.

$$\min_{x}\ \max_{i} |e_i(x)|. \qquad (18.13)$$

Such optimization models are frequently encountered in sophisticated computer aided
design (CAD) problems. Mechanical CAD is used to design surfaces and shapes of
3-dimensional bodies. A smooth, mathematically defined surface is fit to the given set of
data points in 3D. The surface is often chosen to be a typical polynomial of low degree,
called a spline. The coefficients of the surface polynomial are determined by solving an
optimization problem (18.13).

We observe that problem (18.13) is not differentiable at points x where two or more
errors $e_i(x)$ are equal. It is easy to see that problem (18.13) is equivalent to the following
smooth nonlinear optimization problem

$$\min\ z$$

subject to

$$e_i(x) \le z \quad (i = 1, \ldots, p)$$

$$-e_i(x) \le z \quad (i = 1, \ldots, p)$$

$$z \ge 0.$$

Remember, $e_i(x) = y_i - h(x, u_i)$, $(i = 1, \ldots, p)$.
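The effect of the error criterion is easiest to see on the simplest possible model. The sketch below assumes a constant model $h(x, u_i) = x$ with unit weights, for which the minimizers are known in closed form: the mean for the sum of squares, the median for the sum of absolute errors, and the midrange for the maximum absolute error. The data, with one deliberate outlier, are made up.

```python
from statistics import mean, median


def errors(x, y):
    """e_i(x) = y_i - x for the assumed constant model."""
    return [yi - x for yi in y]


def sum_squares(x, y):
    return sum(e * e for e in errors(x, y))


def sum_abs(x, y):
    return sum(abs(e) for e in errors(x, y))


def max_abs(x, y):
    return max(abs(e) for e in errors(x, y))


y = [1.0, 2.0, 3.0, 4.0, 100.0]      # last observation is an outlier

x_ls = mean(y)                       # minimizes the sum of squared errors
x_l1 = median(y)                     # minimizes the sum of absolute errors
x_mm = (min(y) + max(y)) / 2         # minimizes the maximum absolute error
z = max_abs(x_mm, y)                 # optimal value of the min-max problem

# The pair (x_mm, z) satisfies the smooth reformulation e_i <= z, -e_i <= z, z >= 0.
epigraph_ok = z >= 0 and all(e <= z and -e <= z for e in errors(x_mm, y))
print(x_ls, x_l1, x_mm, z, epigraph_ok)
```

The least squares fit is pulled strongly towards the outlier, the absolute-error fit ignores it, and the min-max fit is determined entirely by the two extreme observations.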


18.9 Reliability Optimization 


One of the natural tasks an engineer is often faced with during the process of developing
a new product, is to design a system such that it conforms to a set of reliability
specifications. The probability that a component survives until some specified time is called
the reliability of the component. It is obvious that the reliability of the entire system
depends on the reliability of several components constituting that system. The aim is
to design a system to meet the desired reliability specification while



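$$\max\ R_s = 1 - R_3\left[(1 - R_1)(1 - R_4)\right]^2 - (1 - R_3)\left[1 - R_2\{1 - (1 - R_1)(1 - R_4)\}\right]^2$$

$$\min\ C_s = 2K_1 R_1^{\alpha_1} + 2K_2 R_2^{\alpha_2} + K_3 R_3^{\alpha_3} + 2K_4 R_4^{\alpha_4}$$

subject to

$$R_i \ge 0.5 \quad (i = 1, \ldots, 4)$$

$$R_s \ge 0.9$$

$$C_s \le 700.$$

Here, $K_1 = 100$, $K_2 = 100$, $K_3 = 200$, $K_4 = 150$, $\alpha_i = 0.6$ $(i = 1, \ldots, 4)$.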
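The two objectives of the example can be evaluated at trial points to get a feel for the trade-off. The sketch below assumes the system reliability $R_s = 1 - R_3[(1-R_1)(1-R_4)]^2 - (1-R_3)[1 - R_2\{1-(1-R_1)(1-R_4)\}]^2$ and the cost $C_s = 2K_1R_1^{\alpha_1} + 2K_2R_2^{\alpha_2} + K_3R_3^{\alpha_3} + 2K_4R_4^{\alpha_4}$ with the constants quoted above; the function names and the trial points are ours.

```python
def system_reliability(R):
    """Assumed system reliability for component reliabilities R = (R1, R2, R3, R4)."""
    R1, R2, R3, R4 = R
    q = (1 - R1) * (1 - R4)
    return 1 - R3 * q ** 2 - (1 - R3) * (1 - R2 * (1 - q)) ** 2


def system_cost(R, K=(100, 100, 200, 150), alpha=(0.6, 0.6, 0.6, 0.6)):
    """Assumed cost: components 1, 2 and 4 enter twice, component 3 once."""
    multiplicity = (2, 2, 1, 2)
    return sum(m * k * r ** a for m, k, r, a in zip(multiplicity, K, R, alpha))


def is_feasible(R, min_component=0.5, min_system=0.9, budget=700):
    """Check the bounds on component reliability, system reliability and cost."""
    return (all(min_component <= r <= 1 for r in R)
            and system_reliability(R) >= min_system
            and system_cost(R) <= budget)


low = (0.5, 0.5, 0.5, 0.5)    # cheapest admissible components
high = (1.0, 1.0, 1.0, 1.0)   # perfectly reliable components
for R in (low, high):
    print(R, system_reliability(R), system_cost(R), is_feasible(R))
```

Neither extreme is feasible: the cheapest components miss the reliability target, while perfect components exceed the budget, so the interesting designs lie in between.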

The above example is a small illustration of the role of optimization in reliability
theory. The subject is vast and one often encounters multiobjective optimization problems
set up in different settings while formulating reliability models. An interested reader
may like to refer to [98].


18.10 Wireless Network System 


Another interesting and extremely useful area where optimization models have come up 
in a big way is wireless communication. 

The wireless medium is an inherently multi access medium where the transmissions 
of users interfere with each other and where the ‘channel capacity’ is time varying 
(due to user mobility, multi path, and shadowing). This causes interdependencies across 
users and network layers. In spite of these difficulties, there have been significant recent 
advances that demonstrate that wireless resources across multiple layers (such as time, 
frequency, power, link data rates, and end-user data rates), can be incorporated into a 
unified optimization framework. 

It has been witnessed that convex programming is an important tool for this opti- 
mization task. In order to deal with complex optimization problems, Lagrange duality 
is extensively used as a key tool in decomposing the problem into easily solvable com- 
ponents. At the same time, one needs to realize that convexity is often not enough 
to describe the system completely. The essential features of many wireless cross-layer 
control problems are non-convex. 

We now present a simple problem to illustrate optimization-based approach for re- 
source allocation problems in wireless systems. 

Consider a multihop wireless network with N nodes. Let $\mathcal{L}$ denote the set of node
pairs (i, j) such that transmission from node i to node j is allowed. The data rate $r_{ij}$ of a
link depends on the power $P_{ij}$ assigned to the link, and also on the interference due to the
power assignments on other links.

Let $P = \{P_{ij} \mid (i, j) \in \mathcal{L}\}$ denote the power assignment and
$r = \{r_{ij} \mid (i, j) \in \mathcal{L}\}$ denote the corresponding data rates, which are
completely determined by the global power assignment.
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stem is stable under 


some scheduling policy. 


$[0, M_s]$. Each user $s$ is associated with a utility function $U_s(x_s)$, the utility
it obtains when it transmits data at rate $x_s$.


The scheduling problem.
For any user rate vector x picked by the congestion-control problem, find a
scheduling policy that stabilizes the system.

A system is said to be stable under a scheduling policy if the queue length at each
node remains finite.


The mathematical model of the congestion-control problem is as follows. 


Max $\sum_{s} U_s(x_s)$

subject to $x \in A$, $x_s \le M_s$.


Here A is the capacity region of the system and is defined as the largest set of rate 
vectors x such that for any x € A, there exists some scheduling policy that can stabilize 
the network under the offered load. 
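To make the role of Lagrange duality concrete, here is a minimal MATLAB sketch of a price-based (dual) iteration for a toy instance of the congestion-control problem. Everything specific in it is an assumption made for illustration: the capacity region is taken to be a single shared link, $\{x \ge 0 : \sum_s x_s \le c\}$, the utilities are $U_s(x_s) = w_s \log x_s$, and the names `w`, `c`, `Mmax`, `lambda` and `step` are hypothetical.

```matlab
% Toy congestion control by dual decomposition (illustrative assumptions):
%   maximize sum_s w(s)*log(x(s))  subject to  sum(x) <= c,  0 <= x(s) <= Mmax(s)
w    = [1 2 3];          % utility weights (hypothetical)
c    = 10;               % capacity of the single shared link (hypothetical)
Mmax = [100 100 100];    % upper bounds M_s on the user rates

lambda = 1;              % link price (Lagrange multiplier)
step   = 0.01;           % step size of the price update

for it = 1:5000
    % Each user maximizes w(s)*log(x(s)) - lambda*x(s) on its own
    x = min(w ./ lambda, Mmax);
    % The link raises its price when overloaded and lowers it otherwise
    lambda = max(lambda + step * (sum(x) - c), 1e-6);
end

disp(x)        % rates proportional to the weights, summing to c
disp(lambda)   % equilibrium price, sum(w)/c
```

The point of the design is that the users only need the scalar price `lambda`, so the coupled problem decomposes into independent one-dimensional problems.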

There are different ways of describing the capacity region A, each of which leads to
a different solution. Some of the methods describing the formulation of A are reviewed
in [103].


18.11 NMR Experiment Design 
We now move to the area of Nuclear Magnetic Resonance (NMR) experiments. NMR experiments
lead to models that involve three kinds of quantities.

(i) The vector of variables $x$, known as experiment variables, that can be controlled.

(ii) The unknown parameters $\theta \in \mathbb{R}^n$, known as experimental parameters, which
need to be estimated from the observed responses.

(iii) The observed responses, denoted by $y$.
The quantitative model relating the above described variables is given by

$$y_i = \eta(x_i, \theta) + \epsilon_i \quad (i = 1, \dots, m)$$

where $\epsilon_i$ $(i = 1, \dots, m)$ are uncorrelated normally distributed random noise sources with
zero mean and variance $\sigma^2$.


We begin the discussion by assuming that the ‘response function’ $\eta(x, \theta)$ is a linear
function of $\theta$, i.e.

$$\eta(x, \theta) = \theta^{\top} f(x).$$

Here $f(x)$ is termed as the vector of ‘basis functions’ and is assumed to be known.
The ‘least squares unbiased estimator’ $\hat{\theta}$ of $\theta$ is defined as

$$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{m} \left(y_i - \theta^{\top} f(x_i)\right)^2.$$


Then,

$$\mathrm{Var}(\hat{\theta}) = \sigma^2 \, T(x)^{-1}, \qquad T(x) = \sum_{i=1}^{m} f(x_i) f(x_i)^{\top}.$$
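For a response that is linear in $\theta$, both the least squares estimate and the matrix $\sum_i f(x_i) f(x_i)^{\top}$ are easy to compute. The MATLAB sketch below does this for a straight-line model. The basis $f(x) = (1, x)^{\top}$, the experiment points, the true parameter, the deterministic perturbation that stands in for noise, and the noise variance `sigma2` are all assumptions chosen for illustration.

```matlab
% Least squares for a model linear in theta, with f(x) = [1; x] (assumed basis)
x = (0:0.5:4)';                       % experiment variables x_1..x_m (hypothetical)
F = [ones(size(x)) x];                % i-th row is f(x_i)'
theta_true = [2; -1];                 % parameter used to generate data (hypothetical)

% Small deterministic perturbation in place of random noise, for reproducibility
noise = 0.01 * ((-1) .^ (1:numel(x)))';
y = F * theta_true + noise;           % observed responses

theta_hat = F \ y;                    % least squares estimate of theta
T = F' * F;                           % sum over i of f(x_i) f(x_i)'

sigma2 = 1e-4;                        % assumed noise variance
V = sigma2 * inv(T);                  % resulting covariance of theta_hat

disp(theta_hat')
disp(T)
```

Spreading the experiment points further apart makes `T` larger and `V` smaller, which is the quantity an experiment design would try to control.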


Here $T(x)$ is the Fisher information matrix. The Fisher information matrix is a way
of measuring the amount of information that an observable random variable x carries
about an unknown parameter $\theta$, upon which the likelihood function of $\theta$ depends. Note
that the Fisher information matrix is symmetric and positive semi-definite.
The likelihood function $L(\theta; x)$ is the joint probability of the data x, conditional on the value
of $\theta$, as a function of $\theta$. The score is the derivative of the log of the likelihood function with
respect to $\theta$. Since the expectation of the score is zero, its variance is simply
the second moment of the score, i.e.

$$\mathrm{E}\left[\left(\frac{\partial}{\partial \theta} \log L(\theta; x)\right)^{2}\right].$$
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18.12 Summary and Additional Notes 


not only incorporating physical constraints impacting the
mathematical aspects of the model. The latter ultimately helps us to find
solution methodologies for the problems.


In this chapter, we attempted to illustrate the breadth of applications of optimization
theory in engineering design. Some simple problems have been taken from electrical
engineering, chemical engineering, biological sciences, reliability theory, and
communication. Although these problems are not exhaustive by any means, they provide the
readers with a good sample of the diversity of optimization models. The chapter
provides only a sketch of various problems, and intentionally avoids any serious
computational aspects. One can take a look at [46, 50, 79, 127] for more applications
of optimization in other engineering sciences.
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19 
Matlab Codes For Some Selected Algorithms


19.1 Introduction 


MATLAB, which stands for MATrix LABoratory, is a software package, which is used
extensively in both academia and industry. It is an interactive program for numerical
computation and data visualization, which along with its programming capabilities
provides a very useful tool for almost all areas of sciences and engineering. Unlike
other mathematical packages, such as MAPLE or MATHEMATICA, MATLAB cannot perform
symbolic manipulations without the use of additional Toolboxes. It remains, however,
one of the leading software packages for numerical computation.

MATLAB deals primarily with numbers and collections of numbers called “vectors” 

and “matrices.” The concept of variable and value is very similar to other programming 
languages. A variable is a container that holds a value. 
In MATLAB, the value is typically a number or a vector or a matrix of numbers. Each
variable has a name, such as var1. There are four things one does with variables: create
them, assign them a value, change the assigned value, and use the value that has already
been assigned.

Create a variable and assign it a value
The following command creates a variable named a and assigns it the value 7. To
execute the command, type the following command at the prompt (>>) and press Enter.
(Note: no need to type the prompt, the computer produces this itself.)
>> a = 7

Using a variable that has already been defined
To see what is the value of a variable, just type its name at the command prompt.
To use the value in a calculation, just type the name of the variable where ever one
would like the value to appear. For instance, >> a + a adds a to itself. Arithmetic
operations follow the conventional notation. Try the following:
>> a = 2, b = 3, c = (a + b)^2. Note that ^2 means exponentiation to the power 2.
>> d = (a - b)/(a + b)


After each assignment, MATLAB prints out the value of the variable. This can sometimes
be irritating. One can use a semi-colon at the end of a line in order to suppress this
printing. Try:
>> e = 2*a + 7*b;


Use of Function 
Whenever a function takes arguments, the arguments are enclosed in a pair of parenthe- 


ses. If there are more than one argument, they are separated by commas, and the order 
of the arguments is usually extremely important. One can find out what a function does, 
and what its arguments mean by giving the command: help name of function. 


Try >> help rand.
In most of the cases, the result of using a function is a value. For example, the value
of sqrt(25) is 5. One can use this value in the similar way as one would have used any
other value. For example:
>> b = cos(a) 
or 
>> c = 3.5 + cos(a + b) + sin(sqrt(a)) 


Handling of Vectors and Matrices 


MATLAB deals with collections of numbers called vectors and matrices. A vector is 
simply a list of numbers. A matrix is a rectangular array of numbers. There are many 
ways to make a vector; one of the most useful ways is to make a sequence using the 


colon (:) sequencing operator. 

Try the following commands: 

>> a = 1:10;

>> b = 0:10;

>> c = 9: 0.5: 11.2 which translates as assign c to be the sequence from 9 to 11.2, 
stepping by 0.5. Note that 11.2 is not in the sequence since one can’t get to 11.2 by 
taking steps of size 0.5 starting from 9. 
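The behaviour of the colon operator at the end point can be checked directly. The short sketch below builds the sequence from 9 in steps of 0.5 with the upper limit 11.2 and counts its entries; the variable name `n` is introduced here only for illustration.

```matlab
% Sequence from 9 in steps of 0.5 that does not go beyond 11.2
c = 9:0.5:11.2;
disp(c)            % the last entry is 11, since 11.5 would exceed 11.2

% Number of entries in the sequence
n = numel(c);
disp(n)
```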

To enter a matrix, try the following command:

>> mat = [1 2; 3 4; 5 6; 7 8]

Note that the rows of the matrix are separated by the semi-colon (;). All rows of a
matrix must be the same length. The printed form of the matrix reveals the rectangular
structure clearly.
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Matlab Codes P 

Try the following functions using the vector d:
>> d = 1:10
>> mean(d)     average of the numbers in d.
>> std(d)      standard deviation of the numbers in d.
>> median(d)
>> plot(d)
>> sum(d)
>> plot(d, sin(d))

Function Handling 


A function handle is a MATLAB value that provides a means of calling a function
indirectly. One can pass function handles in calls to other functions (often called
function functions).

handle = @functionname returns a handle to the specified MATLAB function.
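A small sketch may help to see function handles at work. It creates a handle to a built-in function, creates an anonymous function, and passes the latter to the function function `fzero`. The names `h`, `sq`, `v` and `r` are chosen here for illustration.

```matlab
% Handle to an existing function, then an indirect call through the handle
h = @sin;
v = h(pi/2);               % same as sin(pi/2)

% Anonymous function: a handle to an expression in t
sq = @(t) t.^2 - 2;

% Pass the handle to a "function function": find a zero of sq in [1, 2]
r = fzero(sq, [1 2]);

fprintf('v = %.4f, r = %.6f\n', v, r);
```

Here `fzero` never needs to know the name of the function it is solving; it only evaluates the handle it receives.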
andling 
Some MATLAB functions support only vector inputs, others accept matrices. 

When the data is a vector, the result is the same whether the vector has a row wise 
or column wise orientation. 

However, when the data is a matrix, MATLAB performs calculations independently 
for each column. This means that when one passes a matrix as an argument to the 
function max, for example, the result is a row vector containing maximum data values 
for each column in the matrix. 
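The column-wise convention is easy to verify on a small matrix. In the sketch below the matrix `A` and the variable names are made up for illustration; the third argument of `max` selects the dimension along which the maximum is taken.

```matlab
A = [1 8 3;
     7 2 9];

colMax = max(A);           % one maximum per column (row vector)
rowMax = max(A, [], 2);    % one maximum per row (column vector)
viaTranspose = max(A');    % per-row maxima obtained by transposing first

disp(colMax)
disp(rowMax')
```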

Note: When the data is a matrix where each row contains a data set, one must
transpose the matrix before proceeding with the data-analysis tasks so that the data
sets have a column wise orientation. For example, to transpose a real matrix A, use the
syntax A'.
Another important operation that MATLAB can perform with ease is matrix division.
If M is an invertible square matrix and b is a compatible vector then x = M\b is
the solution of Mx = b and x = b/M is the solution of xM = b.

We next present MATLAB codes for some of the algorithms discussed in this book.
For writing codes for nonlinear optimization problems, we have restricted ourselves to
the quadratic case only. This has been done for better understanding of these codes.
These codes can be modified to the general nonlinear case as well. This
will require appropriate modifications in inputting the data by employing symbolic
functions of MATLAB.

Readers are encouraged to use the "help" facility of MATLAB to understand the codes
and also to resolve the difficulties faced while coding.
teolf and also s ty i 
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19.2 Simplex Algorithm 


% The simplex algorithm consists of four basic steps. These are (1)
% printing simplex tableau (2) identifying column to enter (3)
% identifying column to leave, and (4) pivoting. We introduce
% each of these functions here. First we discuss the case when
% artificial variables are not required and next when
% artificial variables are required.


function [E] = simplextableau(Y,Z,Xb,U,V,fval)
% This function displays and prints the simplex tableau in
% the condensed form. Here, the first row corresponds to
% the indices of nonbasic variables and the last column
% corresponds to the indices of basic variables. Last row
% corresponds to the values of z_j-c_j, and the first column
% corresponds to values of basic variables.
% Y is the set of columns of the simplex tableau which correspond
% to non basic variables
% fval is the value of the objective function corresponding
% to the current bfs Xb.
% U (V) is a vector of indices corresponding to basic (non basic)
% variables
% Z is a vector whose entries correspond to the values of zj-cj

[m,n]=size(Y);
[M,N]=size(V);
E=zeros(m+2,N+2);% creates matrix of zeros
for i=1:N
    E(2:(m+1),i+1)=Y(:,V(i));
    %assigns column corresponding to non basic variables from Y to E
    E(m+2,i+1)=Z(V(i));
end
E(1,2:(N+1))=V;
E(2:(m+1),N+2)=U';
E(2:(m+1),1)=Xb;
E(m+2,1)=fval;
return;



function [z2] = findDepartindex(E,z1,M,N)
% This function identifies the departing variable index
% z1 is the index of the entering column

k=1;
for i=2:(M-1)% do for all rows of basic variables
    if(E(i,z1)>0)
        ratio(k) = E(i,1)/E(i,z1);% min ratio if y_ij > 0
    else
        ratio(k)=Inf;%else inf
    end
    k=k+1;
end

[r]=find(ratio==min(ratio));% find the index corr to min ratio
r=max(r);
z2=r(1)+1; %z2 corr to index of departing row
return;
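The minimum ratio rule used to pick the departing row can be seen in isolation on a small example. This is only a sketch with made-up data; `xb` plays the role of the first tableau column (values of the basic variables) and `col` the role of the entering column. Unlike the listing, which takes the last index among tied ratios, `min` returns the first one.

```matlab
xb  = [4; 6; 3];       % current values of the basic variables (hypothetical)
col = [2; -1; 1];      % entries of the entering column (hypothetical)

% Ratios are defined only for positive entries of the entering column
ratio = inf(size(xb));
pos = col > 0;
ratio(pos) = xb(pos) ./ col(pos);

% Row attaining the minimum ratio is the departing row
[minRatio, r] = min(ratio);
fprintf('departing row = %d, minimum ratio = %g\n', r, minRatio);
```

If every entry of `ratio` is `Inf`, no row limits the increase of the entering variable, which is the unboundedness test used in the main routine.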


% This function removes a column from matrix

function [B]=remove(A, col)
%col is the index of column which is to be deleted from matrix A

[m,n]=size(A);
B=A(:,1:(col-1));% extracting columns 1:(col-1) from A
B=[B A(:,(col+1):n)]; %adding columns (col+1):n of A
return;
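As an aside, MATLAB also allows a column to be deleted in place by assigning the empty matrix to it. The sketch below uses a made-up matrix to show that this idiom gives the same result as concatenating the columns on either side of the deleted one.

```matlab
A = [1 2 3;
     4 5 6];
col = 2;                                  % column to delete (illustrative)

% In-place deletion by assigning the empty matrix
B = A;
B(:, col) = [];

% Same result by concatenating the remaining columns
B2 = [A(:, 1:(col-1)) A(:, (col+1):end)];

disp(B)
```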
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634 Numerical Optimization with Applications 
% This function updates the simplex tableau using pivoting 


function [E] = pivoting(E,z1,z2,M,N)
% E is the tableau in the condensed form printed from simplextableau.m
% M is the number of rows
% N is the number of columns
% z1, z2 are the indices of column and row respectively.

pivot=E(z2,z1);% pivot element

for i=1:M % do the pivoting including z_j-c_j row
    if((i~=z2)&&(i~=1))
        % leave the 1st row and the leaving vector row
        for j=1:(N-1)
            % do for all the columns except the entering and last
            if(j~=z1)
                val1 = E(z2,j)/pivot;
                val2 = E(i,z1)*val1;
                E(i,j) = E(i,j)-val2;
            end
        end
    end
end

E(z2,z1)=1/E(z2,z1);% making identity entry at the pivot entry

for i=1:(N-1)% dividing the leaving row with the pivot element
    if(i~=z1)
        E(z2,i)=E(z2,i)/pivot;
    end
end

for i=1:M % calculations pertaining to leaving column
    if((i~=z2)&&(i~=1))
        E(i,z1)=E(i,z1)/(-1*pivot);
    end
end
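The four update rules of the condensed-tableau pivot (pivot element, pivot row, pivot column, remaining entries) can also be written in vectorized form. The sketch below applies them to a small made-up numeric block `T`, without the index row and index column that the listing carries along; `r` and `s` denote the pivot row and column.

```matlab
T = [4 1 2;
     6 3 1];              % numeric part of a tableau (hypothetical data)
r = 1; s = 2;             % pivot row and pivot column
p = T(r, s);              % pivot element

Tn = T - T(:, s) * T(r, :) / p;   % remaining entries: a_ij - a_is*a_rj/p
Tn(r, :) = T(r, :) / p;           % pivot row is divided by the pivot
Tn(:, s) = -T(:, s) / p;          % pivot column is divided by minus the pivot
Tn(r, s) = 1 / p;                 % pivot element is replaced by its reciprocal

disp(Tn)
```

All four rules read from the old tableau `T` and write to `Tn`, so the order of the assignments only matters for the overlapping pivot row, column and element.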
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function [opt_val,y, stat 
% max/min c'*x
% subject to:
% A*x <= b (b >=0)
% x >=0
% input cost, A and b as column vector
% output: value, decision vector
% status=1 for feasible / alternate
% status=2 for unbounded
% function called by this function: simplextableau
% findDepartindex, pivoting, remove

disp('Enter the LPP having less than equal constraints:')
MaxMin = input('Enter whether problem is Max/Min (0/1) : ');
Num1 = input('Enter the Number of Constraints : ');
Num2 = input('Enter the Number of Variables : ');

% if the inputs are not passed when function is called
% then enter them here
if (nargin<3)
    [cost] = input('Enter Cost Matrix: ');
    % This is a column vector of cost vector
    [b] = input('Enter b : ');
    % This is a column vector of rhs vector
    [CoeffMatrix] = input('Enter the Coefficient Matrix : ');
    A = CoeffMatrix;% constraint matrix, input row wise
end
V = [1:Num2];% vector containing the index of nbv variables


% Var is the identity matrix corresponding to slack variables
Var=eye(Num1);
A = [A Var]; % augment it with identity matrix

num_variables=size(A,2);% number of variables
B = eye(Num1);%basis matrix

[m,n]=size(A);
%U is vector of Basic Variables
k=1;
for i=1:Num1
    for j=1:n
        val=isequal(A(:,j),B(:,i));
        % comparing the columns of A with identity matrix
        %and identifying the basis vectors
        if(val==1)
            U(1,k)=j;
            k=k+1;
        end
    end
end

num_variables=size(A,2);% number of variables
B = eye(Num1);%
C = zeros(num_variables-Num2,1);
cost=[cost;C];% total cost vector
Xb = inv(B) * b;% values of basic variables

[mbu,nbu]=size(U);
for i=1:nbu % Cb is cost vector corresponding to basic variables
    % U has the indices corresponding to basic variables
    Cb(i,1)=cost(U(1,i),1);
end
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4 Matlab Codes Fo 
Y=[];
for j=1:n
    value = inv(B) * A(:,j);
    Y = [Y value];
end
% compute (Zj-Cj)
for j=1:n
    Z(1,j)=Cb' * Y(:,j)-cost(j);
end

[mbv nbv]=size(V);

% this function constructs simplex tableau
% in the condensed form, where the first row corresponds
% to the indices of nonbasic variables and last
% column corresponds to indices of basic variables, last row
% corresponds to values of z_j-c_j, first column corresponds
% to the values of basic variables
[E]=simplextableau(Y,Z,Xb,U,V,fval);


[M,N]=size(E);

% display tableau
disp(' Initial simplex tableau')
disp(' ------------------------')
disp(E)
unbounded=0;
status=0;
% iterate until optimality condition is satisfied or unbounded
% condition is met
while(status==0)
    [c]=find(E(M,2:(N-1)) >= 0);%checking the optimality criterion
    if (size(c,2)==nbv)
        status=1;
    else % go for one more iteration
        % check for unbounded solution
        for k=2:(N-1)
            if(E(M,k)<0)
                % checking the sign of y_{rj}
                [r]=find(E(2:(M-1),k) <= 0);
                if(size(r,1)==(M-2))
                    % if length of r = M-2 i.e. number of basic variables
                    disp('Problem has unbounded solution');
                    unbounded=1;
                    status=2;
                end
            end
        end
        %if the problem is not unbounded then proceed
        if unbounded~=1
            [minimum, index] = min(E(M,2:(N-1)));
            z1 = index(1)+1;
            % find the index having minimum (Zj-Cj)
            [z2] = findDepartindex(E,z1,M,N);
            % find the departing variable
            [E] = pivoting(E,z1,z2,M,N);
            % update tableau using pivoting
            % Swapping the indices of basic and non-basic variables
            x=U(:,z2-1);
            U(:,z2-1)=V(:,z1-1);
            V(:,z1-1)=x;
            % updating basic(non basic) variable index
            E(1,2:(N-1))=V;
            E(2:(M-1),N)=U';
            % display tableau
            disp(' Next simplex tableau ')
            disp(' ------------------------')
            disp(E)
        end
    end
end
%if unbounded then return
if (unbounded==1)
    y=[];
    opt_val=1e10;
    return;
end

opt_val=0;
if(unbounded==0)% optimal solution exist
    if(MaxMin==1)

        fprintf('Optimal Value : %6.2f \n',-1*E(M,1))
        opt_val=-1*E(M,1);
    else
        fprintf('Optimal Value : %6.2f \n',E(M,1))
        opt_val=E(M,1);
    end
    status=1;
    fprintf('Optimal Solution : \n');
    for i=1:Num2
        % tracing out the decision variables
        [x]=find(E(2:(M-1),N)==i);
        if(size(x,1)>0)
            y(i)=E(x(1)+1,1);
            fprintf('Value of variable %d = %6.2f \n',i,E(x(1)+1,1))
            %if it is present in the tableau then take the value
        else
            y(i)=0;
            fprintf('Value of Variable %d = 0 \n',i);
            %if decision variables are present at nonbasic variables
            %then its value is zero
        end
    end


    [c]=find(E(M,2:(N-1)) == 0);
    if (size(c,2)>0)
        disp('Problem has Alternate Solution')
        [minimum, index] = min(E(M,2:(N-1)));
        z1 = index(1)+1;
        % find the index having minimum (Zj-Cj)
        [z2] = findDepartindex(E,z1,M,N);
        % find the departing variable
        [E] = pivoting(E,z1,z2,M,N);
        % update tableau using pivoting
        % Swapping basic and non-basic variables
        x=U(:,z2-1);
        U(:,z2-1)=V(:,z1-1);
        V(:,z1-1)=x;
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disp(E) me 
        disp(' ========================');
        fprintf(' Alternate Optimal Solution : \n');
        for i=1:Num2
            % tracing out the decision variables
            [x]=find(E(2:(M-1),N)==i);
            if(size(x,1)>0)
                y(i)=E(x(1)+1,1);
                fprintf('Value of variable %d = %6.2f \n',i,E(x(1)+1,1))
                %if it is present in the optimal tableau then take the value
            else
                y(i)=0;
                fprintf('Value of variable %d = 0 \n',i)
                %if decision variables are present at nonbasic variables
                %then its value is zero
            end
        end
    end
end
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<0 
: aa 










: Jas 
(cost, A ‚b, type_var, type, ste inPlexty 
% min/max c'*x
% subject to:
% A*x (<=, >=, =) b
% x (>=, <=, unrestricted) 0
% input cost, A and b as column vector
% output : value, decision vector
% status=2 for infeasible and unbounded
% status=1 for feasible
% function called by this function: simplextableau,
% findDepartindex, pivoting, remove
[Num1,Num2]=size(A);


disp('Enter the LPP having mixed Constraints:')

if (nargin<6)
    type_obj = input('Enter whether problem is Max/Min (0/1) : ');
end
if (nargin<5)
    for i=1:Num1
        fprintf(' Constraint No. %d ',i)
        type(i) = input('Enter the type <= or >= or = (1/2/3):');
    end
end

if (nargin<4)
    for i=1:Num2
        fprintf('Variable No. %d \n',i)
        type_var(i) = input('Enter the type <= or >= or unrestricted(1/2/3):');
    end
end


if (nargin<3)
    [cost] = input('Enter Cost Matrix : ');
    % This is a column vector of cost vector
    [b] = input('Enter b : ');
    % This is a column vector of rhs vector
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index = find(b < 0); 

if size(index,1)~=0
    b(index) = -b(index);
    A(index,:) = -A(index,:);
end
% if the variable is unrestricted then introduce one more variable
j=1;
for i=1:Num2
    if (type_var(i)==1)
        % if the variable is <=0 then multiply by -1,
        %change in the elements of A and c
        A(:,i)=-A(:,i);
        cost(i)=-cost(i);
    elseif (type_var(i)==3)
        A(:,Num2+j)=-A(:,i);
        %introduce variable with negative coefficient matrix
        %and cost vector
        % it will increase number of variables by one
        cost(Num2+j)=-cost(i);
        j=j+1;
    end
end
if(MaxMin==1)
    cost = -1 * cost;
end
if isempty(find(type_var==3))
    V = [1:Num2];
    % vector containing the indices of nbv variables
    k=Num2+1;
else
    V = [1:Num2+j-1];
    % vector containing the indices of nbv variables
    k=Num2+j;
end
% get the type of constraint
NumArtificalVar=zeros(1,Num1);
% number of artificial variables are atmost the number of constraints
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o zeros(num_variables- 
| Cost=[cost;C];% total cos 





for i=1:Num1
    Var=zeros(Num1,1);
    if(type(i)==1)
        Var(i,1)=1;
        A = [A Var]; % augment it with identity matrix
    else
        if(type(i)==2)
            Var(i,1)=-1; % constraint of type >=
            NumArtificalVar(1,i)=1;
            % surplus variable corresponding to this constraint
            V(1,k)=Num2+i;
            k=k+1;
            A = [A Var];%augment it with the surplus column
        else
            % constraint of type =
            NumArtificalVar(1,i)=1;%introduce artificial variable
        end
    end
end
end 
% add artificial variables
k=1;
[m,n]=size(A);
Art=[];% empty matrix for artificial variable
for i=1:Num1
    if(NumArtificalVar(1,i)==1)
        Var=zeros(Num1,1);
        Var(i,1)=1;
        A = [A Var];
        Art(1,k)=n+k;
        k=k+1;
    end
end
num_variables=size(A,2);
B = eye(Num1);%
C = zeros(num_variables-Num2,1);
cost=[cost;C];% total cost vector
[m,n]=size(A);
y D] i 7 Pe tT D y ~ we cy] ac | if * 


lest Oe =p v 
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else 
    for i=1:Num1
        for j=1:n
            val=isequal(A(:,j),B(:,i));
            % comparing the columns of A with identity matrix
            %and identifying the basis vectors
            if(val==1)
                U(1,k)=j;
                k=k+1;
            end
        end
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Phase-I if the constraints are of type ==2,3
% and normal simplex if type==1
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


cost1=zeros(n,1);
[mArt,nArt]=size(Art);

if nArt~=0
    for i=1:nArt
        % cost assigned to artificial variable (the tableau maximizes,
        % so Phase I maximizes minus the sum of artificial variables)
        cost1(Art(1,i),1)=-1;
    end
else
    cost1=cost;
end
Xb = inv(B) * b;% value of basic variables
[mbu,nbu]=size(U);
for i=1:nbu
    if nArt~=0
        Cb(i,1)=cost1(U(1,i),1);
    else
        Cb(i,1)=cost(U(1,i),1);
    end
end
Y=[];
for j=1:n
    value = inv(B) * A(:,j);
    Y = [Y value];
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end 
% compute (Zj-Cj)
Z=zeros(1,n);
for j=1:n
    Z(1,j)=Cb'*Y(:,j)-cost1(j);
end
[mbv nbv]=size(V);
% this function constructs and prints the simplex tableau
% in the condensed form, where the first row corresponds
% to the indices of nonbasic variables and last
% column corresponds to indices of basic variables, last row
% corresponds to values of z_j-c_j, first column corresponds
% to the values of basic variables

[E]=simplextableau(Y,Z,Xb,U,V,fval);
[M,N]=size(E);

disp(' Initial simplex tableau - Phase I')
disp(' ----------------------------------')
disp(E)
infeasible=0;
while(1) 
[cj=find(EM!, 2° (N-1)) 7= 9) ;%checking the optimality criterion 
at (size(c,2)==nbv) % End Phase I 
fval = E(M, 1); 
if abs (fval)>=0. 0000% stop with this epsilon tolerance 
à ‘able comes jn picture 
if ys cemmeyceaccas 
infeasible=! ; 
disp(’ LPP is infeasible’) 


y- 0; 
py J=- 1000000; 
status=4 
preak:; 
else ck variables 
obl ai olution 
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break; 


        else
            fprintf('Optimal Value : %6.2f \n',E(M,1))
            opt_val=E(M,1);
        end
        fprintf('Optimal Solution : \n');
        for i=1:Num2
            % tracing out the decision variables
            [x]=find(E(2:(M-1),N)==i);
            if(size(x,1)>0)
                % value of decision variable
                y(i)=E(x(1)+1,1);
                %if it is present in the optimal tableau then
                %take the value from the optimal tableau
                fprintf('Value of variable %d=%6.2f \n',i,E(x(1)+1,1))
            else
                y(i)=0;
                %if decision variables are present at
                %nonbasic variables then its value is zero
                fprintf('Value of variable %d = 0 \n',i)
            end
        end
    end


    else % go for one more iteration
        [minimum, index] = min(E(M,2:(N-1)));
        % find the index having minimum (Zj-Cj)
        z1 = index(1)+1;
        % find the departing variable
        [z2] = findDepartindex(E,z1,M,N);
        % update tableau using pivoting
        [E] = pivoting(E,z1,z2,M,N);
        % Swapping the indices of basic and non-basic variables
        x=U(:,z2-1);
        U(:,z2-1)=V(:,z1-1);
        V(:,z1-1)=x;
        E(1,2:(N-1))=V;



































%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
if(infeasible==0)% if not infeasible then move to Phase II
    Xb=E(2:(M-1),1);% value of basic variables


% remove the artificial variables.
% Here it is assumed that all artificial variables
% are nonbasic variables before going to Phase II.
for i=1:nArt
    [a]=find(E(1,:)==Art(1,i));
    if(size(a,2)~=0)
        [E]=remove(E,a(1)); % remove from E
        [b]=find(V(1,:)==Art(1,i));
        [V]=remove(V,b(1)); % remove from V
    end
end
[M,N]=size(E);
[mbu,nbu]=size(U);
Cb=zeros(nbu,1);
for i=1:nbu
    Cb(i,1)=cost(U(1,i),1);
end
% compute new (Zj-Cj)
for j=2:(N-1)
    Y = E(2:(M-1),j);
    E(M,j)=Cb'*Y-cost(V(1,j-1),1);
end
% compute initial value of phase II
fval = Cb'*Xb;
E(M,1)=fval; % update the objective value in the tableau
[mbv nbv]=size(V);
unbounded=0;


while(1)
    % iterate while some z_j-c_j is negative
    [c]=find(E(M,2:(N-1)) >= 0);
    if (size(c,2)==nbv) % optimal solution
        % check for alternate solution
        [c]=find(E(M,2:(N-1)) == 0);
        if (size(c,2)>0)
            disp('Problem has Alternate Solution')
        end
        break;
    else
        [minimum, index] = min(E(M,2:(N-1)));
        % find the index having minimum (Zj-Cj)
        z1 = index(1)+1;
        % find the departing variable
        [z2] = findDepartindex(E,z1,M,N);
        % update tableau using pivoting
        [E] = pivoting(E,z1,z2,M,N);
        % Swapping the indices of basic and non-basic variables
        x=U(:,z2-1);
        U(:,z2-1)=V(:,z1-1);
        V(:,z1-1)=x;
        E(1,2:(N-1))=V;
        E(2:(M-1),N)=U';
        % display tableau
        disp(' Next simplex tableau - Phase II')
        disp(' ----------------------------------')
        disp(E)
        % check for unbounded solution
        for k=2:(N-1)
            if(E(M,k)<0)
                % if length of r = M-2 i.e. number of basic variables
                % checking the sign of y_{rj}
                [r]=find(E(2:(M-1),k) <=0 );
                if(size(r,1)==(M-2))
                    disp('Problem has unbounded solution');
                    unbounded=1;
                end
            end
        end
        % leave the iteration as soon as unboundedness is detected
        if (unbounded==1)
            y=[];
            opt_val=1e10;
            break;
        end
    end
end
if (unbounded==0)
    opt_val=0;
end


% unbounded is zero implies that optimal solution is achieved
if(unbounded==0)
    if(MaxMin==1)
        fprintf(' Optimal Value : %6.2f \n',-1*E(M,1))
        opt_val=-1*E(M,1);
    else
        fprintf(' Optimal Value : %6.2f \n',E(M,1))
        opt_val=E(M,1);
    end
    status=1;
    fprintf(' Optimal Solution : \n');
    for i=1:Num2
        % tracing out the decision variables
        [x]=find(E(2:(M-1),N)==i);
        if(size(x,1)>0)
            %if it is present in the optimal tableau then take the value
            y(i)=E(x(1)+1,1);
            fprintf(' Value of variable %d = %6.2f \n',i,E(x(1)+1,1))
        else
            %if decision variables are present at nonbasic variables
            %then its value is zero
            y(i)=0;
            fprintf(' Value of variable %d = 0 \n',i)
        end
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19.3 Dual Simplex Algorithm 


%The dual simplex method can be used to solve
%any LPP provided we have an initial basic solution
%for which all (z_j-c_j)>=0, in particular we may take
%the LPP in the form
% max c'*x
%Subject to:
% A*x (<=, >=) b
% x >=0
%with c <=0

disp('Enter the LPP to be solved by dual simplex :')
Num1 = input('Enter the Number of Constraints : ');
Num2 = input('Enter the Number of Variables : ');
[CoeffMatrix] = input('Enter the Coefficient Matrix : ');
% enter the coefficient of each of the variable even
% though it may be zero
[b] = input('Enter b : '); % This is a column vector
[cost] = input('Enter Cost Matrix : '); % This is a column vector

for i=1:Num1
    fprintf('Constraint No. %d ',i)
    type = input('Enter the type <= or >= (1/2) : ');
    if (type==1)
        CoeffMatrix(i,:)=-1*CoeffMatrix(i,:);
        b(i)=-1*b(i);
    end
end
Surplus = -1*eye(Num1);
A = [CoeffMatrix Surplus];
B = -1*eye(Num1);
for i=1:Num1
    U(1,i) = Num2+i;
end
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for j=1:n 
    value = inv(B)*A(:,j);
    Y = [Y value];
end
% compute (Zj-Cj) which should be >=0
% for the stated form of the LPP
Z=zeros(1,n);
for j=1:n
    Z(1,j)=Cb'*Y(:,j)-cost(j);
end
[mbv nbv]=size(U);
% construct tableau in condensed form
[E]=Dualsimplextableau(Y,Z,Xb,U,V,fval);
[M,N]=size(E);
% display tableau
disp(' Initial dual simplex tableau')
disp(' ----------------------------')
disp(E)

infeasible=0;
while(1) 


[c]=findE (2: M-1), 1)>=0); 
dag (size(c, D =nbv) | 
% solution is not optimal solution 
% go for one more iteration 
% find the index having minimum 
Z2 = index(1)+1; | eee 
% find the departing varia 
mi s pualdepartinee” 
% update tableau using Pt 
j j 1,Z 
[E] = pivoting( „Zi, ae 
% Swapping the indices basic 
TEA 
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Some Select 

% This function displays the dual simplex tableau in
% the condensed form and prints it, where the first row
% corresponds to the indices of nonbasic variables and the
% last column corresponds to the indices of basic variables.
% Last row corresponds to the values of z_j-c_j and the first
% column corresponds to the values of the basic variables

function [E] = Dualsimplextableau(Y,Z,Xb,U,V,fval)

% Y is the set of columns of the simplex tableau which correspond
% to non basic variables
% fval is the value of the objective function corresponding
% to the current basic solution Xb.
% U(V) is a vector of indices corresponding to basic(non basic) variables
% Z is a vector whose entries correspond to the values of zj-cj

[m,n]=size(Y);
[M,N]=size(V);
E=zeros(m+2,N+2);

for i=1:N
    E(2:(m+1),i+1)=Y(:,V(i));
    E(m+2,i+1)=Z(V(i));
end

E(1,2:(N+1))=V; E(2:(m+1),N+2)=U'; E(2:(m+1),1)=Xb; E(m+2,1)=fval;

return;
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eal WWM TL) SOLVE 


% MATLAB code to solve the balanced assignment problem

c=input('\nEnter Cost Matrix : ');% enter cost matrix
[m,n]=size(c);
%m-no of jobs
%n-no of persons, here m and n are equal
c1=c;
% To subtract row minima from each row
for i=1:n
    min1=min(c1(i,:));
    c1(i,:)=c1(i,:)-min1;
end
% To subtract column minima from each column
for j=1:n
    min1=min(c1(:,j));
    c1(:,j)=c1(:,j)-min1;
end
disp(c1);
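The two reduction steps (subtracting row minima, then column minima) can be written without loops. The following sketch does this with `bsxfun` on a made-up 3 x 3 cost matrix; the data are hypothetical and serve only to show that every row and every column of the reduced matrix contains at least one zero.

```matlab
c = [9 2 7;
     6 4 3;
     5 8 1];                                  % cost matrix (hypothetical data)

% Subtract the minimum of each row from that row
c1 = bsxfun(@minus, c, min(c, [], 2));

% Subtract the minimum of each column from that column
c1 = bsxfun(@minus, c1, min(c1, [], 1));

disp(c1)
```

The zeros of the reduced matrix are the candidate assignments among which the encircling and crossing steps then search for n independent zeros.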
fprintf(1,'1(0) is displayed if corresponding row or column is ticked(unticked)\n');

% opt is a variable which represents number of independent zeros
opt=0;
%loop executes until number of zeros=n
%-10(-20) denotes encircled(crossed) zero
while (opt<n)
    flag_m=1;
    %stop when either all zeros are encircled or crossed
    while(flag_m)
        flag_m=0;
        flag=1;
        % stop when there is no row left with a single zero
        while(flag)
            flag=0;
            for i=1:n
                if numel(find(c1(i,:)==0))==1
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u=zeros(1,n); 





% to tick rows that do not have encircled zero
for i=1:n
    if numel(find(c1(i,:)==-10))==0
        u(i)=1;
    end
end
v=zeros(1,n);
flag=1;
while(flag) % stop when chain of ticking is completed
    flag=0;
    for j=1:n
        if numel(find((c1(:,j)==-20) & u'==1))~=0
            v(j)=1;
        end
        if numel(find((c1(:,j)==-10) & u'==1))~=0
            v(j)=1;
        end
    end
    % to tick rows that have encircled zeros in ticked columns
    for i=1:n
        if numel(find(c1(i,:)==-10 & v==1))~=0
            u(i)=1;
        end
    end
    for j=1:n
        if numel(find((c1(:,j)==-20) & u'==1))~=0 & v(j)==0
            flag=1;
        end
        if numel(find((c1(:,j)==-10) & u'==1))~=0 & v(j)==0
            flag=1;
        end
    end
end
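The row and column reduction performed at the start of the assignment code can also be written in vectorized form. The following is only a minimal sketch of that reduction step, not of the ticking procedure: the cost matrix is made-up illustrative data, and the subtraction assumes implicit expansion (available in MATLAB R2016b and later).

```matlab
% Illustrative 3x3 cost matrix (hypothetical data, not taken from the text)
c = [9 2 7;
     6 4 3;
     5 8 1];

% Subtract the minimum of every row from that row
c1 = c - min(c, [], 2);

% Subtract the minimum of every column from that column
c1 = c1 - min(c1, [], 1);

% Every row and every column now contains at least one zero
disp(c1)
```

After this step the independent zeros of c1 are the candidates for the assignment.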
Branch and Bound Method (BnB)

% BnB finds the solution by branch and bound
function [x,val,status]=bnb
% call from command window:
A=[-1 3;
   7 1;
   1 0;
   0 1];
b=[6;
   35;
   7;
   7];
cost=[-7;
      -9];%take the negative for maximization problems
e=2^-24;
type_var=2;
type=1;
[m,n]=size(A);
M=[type_var*ones(m,1)];%type_var is 2 for >=0 variables
%(<= or >= or unrestricted (1/2/3)).
%type = vector of 1 or 2 or 3, i.e., <= or >= or = (1/2/3)
N=[type*ones(m,1)];
P=[1,2];
bound=inf; % the initial bound is set to +ve infinity
[x0,val0,status0]=Simplexbnb(cost,A,b,M,N);
% a recursive function that processes the BB tree
[x,val,status,bound]=branch(cost,A,b,x0,val0,M,N,e,bound,P);
val=-val;

function [xx,val,status,bb]=branch(cost,A,b,x,v,M,N,e,bound,P)
% x is an initial solution and v is the
% corresponding objective function value
% solve the corresponding LP model with the integrality constraints removed
[x0,val0,status0]=Simplexbnb(cost,A,b,M,N);
% if the solution is not feasible or the value of the
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% objective function is higher than the current bound 
% return with the input initial solution
if status0<=0 | val0 > bound
    xx=x; val=v; status=status0; bb=bound;
    return;
end
% if the integer-constraint variables turned to be
% integers within the input tolerance return
ind=find( abs(x0(P)-round(x0(P)))>e );
if isempty(ind)
    status=1;
    % this solution is better than the current solution
    % hence replace
    if val0 < bound
        x0(P)=round(x0(P));
        xx=x0;
        val=val0;
        bb=val0;
    else
        xx=x; % return the input solution
        val=v;
        bb=bound;
    end
    return
end
% if we come here this means that the solution of
% the LP relaxation is feasible and gives a less value
% than the current bound but some of the integer-constraint
% variables are not integers. Therefore we pick the
% first one that is not integer and form two LP problems
% and solve them recursively by calling the same function(branching)
% first LP problem with the added constraint that
% Xi <= floor(Xi) , i=ind(1)
br_var=P(ind(1));
% branch on the fractional value in the current relaxation x0
br_value=x0(br_var);
if isempty(A)
    [r c]=size(Aeq);
else
    [r c]=size(A);
end
A1=[A ;zeros(1,c)];
A1(end,br_var)=1;
b1=[b; floor(br_value)];
M1=[M;2];
N1=[N;1];
% second LP problem with the added constraint that
% Xi >= ceil(Xi) , i=ind(1)
A2=[A ;zeros(1,c)];
A2(end,br_var)=-1;
b2=[b; -ceil(br_value)];
M2=[M;2];
N2=[N;2];
% solve the first LP problem
[x1,val1,status1,bound1]=branch(cost,A1,b1,x0,val0,M,N1,e,bound,P);
status=status1;
% if the solution was successful and gives a better bound
if status1 >0 & bound1<bound
    xx=x1;
    val=val1;
    bound=bound1;
    bb=bound1;
else
    xx=x0;
    val=val0;
    bb=bound;
end
% solve the second LP problem
[x2,val2,status2,bound2]=branch(cost,A2,b2,x0,val0,M,N2,e,bound,P);
% if the solution was successful and gives a better bound
if status2 >0 & bound2<bound
    status=status2;
    xx=x2;
    val=val2;
    bb=bound2;
end
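The branching step can be looked at in isolation. The sketch below uses illustrative data (the constraints -x1+3x2<=6 and 7x1+x2<=35, whose common vertex (4.5, 3.5) is fractional) and writes both child constraints in the <= form; the names A1, b1, A2, b2, P and e follow the listing, while br_val, cut1 and cut2 are hypothetical names introduced only for this example.

```matlab
A  = [-1 3; 7 1];        % two illustrative constraints A*x <= b
b  = [6; 35];
x0 = A \ b;              % vertex where both are active: (4.5, 3.5)
P  = [1 2];              % indices of the integer-constrained variables
e  = 2^-24;              % integrality tolerance

ind    = find(abs(x0(P) - round(x0(P))) > e);  % fractional variables
br_var = P(ind(1));                            % branch on the first one
br_val = x0(br_var);

c  = size(A, 2);
% child 1: x(br_var) <= floor(br_val)
A1 = [A; zeros(1, c)];  A1(end, br_var) =  1;  b1 = [b;  floor(br_val)];
% child 2: x(br_var) >= ceil(br_val), written as -x(br_var) <= -ceil(br_val)
A2 = [A; zeros(1, c)];  A2(end, br_var) = -1;  b2 = [b; -ceil(br_val)];

% the fractional point x0 is excluded by both children
cut1 = A1(end, :)*x0 > b1(end);
cut2 = A2(end, :)*x0 > b2(end);
fprintf('x0 violates child 1: %d, child 2: %d\n', cut1, cut2);
```

Since x0 is infeasible for both children while every integer point remains feasible for one of them, solving the two child problems loses no integer solution.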
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% code for branch and bound. remaining functions are 


% similar to two phase method 
function [y,opt_val,status]=Simplexbnb(cost,A,b,type_var, type) 


%type_var=variable type
%type=type of constraints
% min c'*x subject to: A*x (<=, =>, =) b
% x =>0
%status=-2 for infeasible and unbounded
% status=1 for feasible
MaxMin =1;
cost=-cost;
[Num1,Num2]=size(A);
index = find(b < 0);
if size(index,1)~=0
    b(index) = -b(index);
    A(index,:) = -A(index,:);
end
j=1;
for i=1:Num2
    % if the variable is <=0 then multiply
    % by -1, the coeff in A in c
    if (type_var(i)==1)
        A(:,i)=-A(:,i);
        cost(i)=-cost(i);
    elseif (type_var(i)==3)
        %introduce variable with negative coefficient
        %matrix and cost vector
        A(:,Num2+j)=-A(:,i);
        % it will increase no of variables by one
        cost(Num2+j)=-cost(i);
        j=j+1;
k=Num2+1;
else
V=[1:Num2+j-1];
% vector containing the indices of nbv variables
k=Num2+j;
end
% get the type of constraint
NumArtificalVar=zeros(1,Num1);
% number of artificial variables are atmost the number of constraints
% Var is the identity vector corresponding to slack or
% artificial variable
for i=1:Num1
    Var=zeros(Num1,1);
    if(type(i)==1)
        Var(i,1)=1; % constraint of type <=
        A = [A Var]; % augment it with identity matrix
    else
        if(type(i)==2)
            Var(i,1)=-1; % constraint of type >=
            NumArtificalVar(1,i)=1; % introduce artificial variable
            % surplus variable corresponding to ith constraint
            V(1,k)=Num2+i;
            k=k+1;
            A = [A Var];%augment it with identity matrix
        else
            % constraint of type =
            NumArtificalVar(1,i)=1;%introduce artificial variable
        end
    end
end
% add artificial variables
[m n]=size(A);
Art=[];% empty matrix for artificial variable
for i=1:Num1
    if(NumArtificalVar(1,i)==1)
        Var=zeros(Num1,1);
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else 
    Cb(i,1)=cost(U(1,i),1);
end
fval = Cb'*Xb;% objective function value
Y=[];% calculation of y_j
for j=1:n
    value = inv(B) * A(:,j);
    Y = [Y value];
end
% compute (Zj-Cj)
Z=zeros(1,n);
for j=1:n
    Z(1,j)=Cb'*Y(:,j)-cost1(j);
end
[mbv,nbv]=size(V);
% this function constructs and prints the simplex tableau
% in the condensed form, where the first row corresponds
% to the indices of nonbasic variables and last
% column corresponds to indices of basic variables, last row
% corresponds to values of Z_j-c_j, first column corresponds
% to the values of basic variables
[E]=simplextableau(Y,Z,Xb,U,V,fval);
[M,N]=size(E);
% display tableau
disp(' Initial simplex tableau - Phase I')
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infeasible=0; 
while(1)
% checking the optimality criterion
[c]=find(E(M,2:(N-1)) >= 0);
if (size(c,2)==nbv) % End Phase I
opt_val=-1000000;
status=-2;
break;
else
%if the problem deals with slack variables
% only then print the optimal solution
if(MaxMin==1)
    fprintf(' Optimal Value:%6.2f \n',-1 * E(M,1))
    opt_val=-1 * E(M,1);
else
    fprintf('Optimal Value : %6.2f \n',E(M,1))
    opt_val=E(M,1);
end
fprintf('Optimal Solution : \n');
for i=1:Num2
    % tracing out the decision variables
    [x]=find(E(2:(M-1),N)==i);
    if(size(x,1)>0)
        % value of decision variable
        y(i)=E(x(1)+1,1);
        %if it present in the optimal tableau then
        %take the value from the optimal tableau
        fprintf('Value of variable %d=%6.2f \n',i,E(x(1)+1,1))
    else
        %fprintf(' Value of variable %d = 0 \n',i)
        y(i)=0;
        %if decision variables are present at
        %nonbasic variables then its value is zero
        fprintf('Value of variable %d = 0 \n',i)
    end
end
end
end
break;
else % go for one more iteration
[minimum,index] = min(E(M,2:(N-1)));
% find the index having minimum (Zj-Cj)
[E] = pivoting(E,z1,z2,M,N);
% Swapping the indices of basic and non-basic variables
x=U(:,z2-1);
U(:,z2-1)=V(:,z1-1);
V(:,z1-1)=x;
E(1,2:(N-1))=V;
E(2:(M-1),N)=U';
% display tableau
disp(' Next simplex tableau - Phase I')
disp(' ----------------------------------------')
disp(E)
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% if the problem is of type=2 or 3 then go for Phase-II
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
if(infeasible==0)% if not infeasible then move to phase II
Xb = E(2:(M-1),1);% value of bv's
% remove the artificial variable column from the tableau
%Here it is assumed that all artificial variables are zero
%as nonbasic variables. If some artificial variable is zero
%as a basic variable then "exchange" is to be done before
%going to Phase II. This part can be coded separately.
for i=1:nArt
    [a]=find(E(1,:)==Art(1,i));
    if(size(a,2)>0)
        [E]=remove(E,a(1)); % remove column from E
        [b]=find(V(1,:)==Art(1,i));
        [V]=remove(V,b(1)); % remove from V
    end
end
[M,N]=size(E);
[mbu,nbu]=size(U);
Cb=zeros(nbu,1);
for i=1:nbu
    Cb(i,1)=cost(U(1,i),1);
end
% compute new (Zj-Cj)
for j=2:(N-1)
    E(M,j)=Cb'*E(2:(M-1),j)-cost(E(1,j),1);
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end 
% compute initial value of phase II 
fval = Cb’*Xb; 


E(M,1)=fval; % update E matrix
[mbv,nbv]=size(V);
% display tableau
disp(' Initial simplex tableau - Phase II')
disp(' ----------------------------------------')
disp(E)
unbounded=0;
while(1)
% do iteration until all z_j-c_j are not >= 0
[c]=find(E(M,2:(N-1)) >= 0);
if (size(c,2)==nbv) % optimal solution
    % check for alternate solution
    [c]=find(E(M,2:(N-1)) == 0);
    if (size(c,2)>0)
        disp('Problem has Alternate Solution')
    end
    break;
else
    [minimum,index] = min(E(M,2:(N-1)));
    % find the index having minimum (Zj-Cj)
    z1 = index(1)+1;
    % find the departing variable
    [z2] = findDepartindex(E,z1,M,N);
    % update tableau using pivoting
    [E] = pivoting(E,z1,z2,M,N);
    % Swapping the indices of basic and non-basic variables
    x=U(:,z2-1);
    U(:,z2-1)=V(:,z1-1);
    V(:,z1-1)=x;
    E(1,2:(N-1))=V;
    E(2:(M-1),N)=U';
    % display tableau
    disp(' Next simplex tableau - Phase II')
% if length Of r=m-2 len 7 
% checking the Sign of (r ir 
(r]=find(E(2:(M-1) k) eas vi 
if(size(r,1)==(M-2))
    disp(' Problem has unbounded solution');
    unbounded=1;
    status=2;
    break;
end
end
end
if (unbounded==1)
    y=[];
    opt_val=1e10;
    break;
end
end
end
opt_val=0;
% unbounded is zero implies that optimal solution is achieved
if (unbounded==0)
    if (MaxMin==1)
        fprintf('Optimal Value : %6.2f \n',-1 * E(M,1))
        opt_val=-1 * E(M,1);
    else
        fprintf('Optimal Value : %6.2f \n',E(M,1))
        opt_val=E(M,1);
    end
    status=1;
    fprintf('Optimal Solution : \n');
    for i=1:Num2
        % tracing out the decision variables
        [x]=find(E(2:(M-1),N)==i);
        if(size(x,1)>0)
            y(i)=E(x(1)+1,1);
            %if it present in the optimal tableau then take the value
            fprintf('Value of variable %d = %6.2f \n',i,E(x(1)+1,1))
        else
            y(i)=0;
            fprintf('Value of variable %d = 0\n',i)
19.6 Golden Section Rule

function x = golden_section(xl, xu, eps, y)
% Golden Section - This function calculates the minimum of
% a unimodal function of 1-variable using Golden Section Rule. Here,
% the objective function taken is a polynomial of degree n, but the
% code can be appropriately modified for general case as well.
% The function is Min a_nx^n + a_{n-1}x^{n-1} ...... + a_0
% Inputs:
% xl : Lower bound of variable
% xu : Upper bound of variable
% eps : Epsilon Parameter (For Stopping the algorithm)
% m=input('the order of poly =')
% for i=1:m+1
% y(i)=input('the coeff of poly starting from constt =')
% end
% for eg. min x^2 in [-5,15] call following
% x = golden_section(-5, 15, 0.01, [0;0;1])
xp = xu - 0.618 * (xu-xl);
xq = xl + 0.618 * (xu-xl);
k = 1;
Fp = mypoly(y,xp);% or any other function for eg, feval('sin', xp)
Fq = mypoly(y,xq);
l = xu - xl;
while (l > eps)
    if (Fp <= Fq)
        xu = xq;
        xq = xp;
        xp = xu - 0.618*(xu - xl);
    else
        xl = xp;
        xp = xq;
        xq = xl + 0.618*(xu - xl);
    end
    k = k + 1;
    Fp = mypoly(y, xp);
    Fq = mypoly(y, xq);
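The helper mypoly is not listed in this section, so the following script is only a sketch of how the golden section rule behaves on the example min x^2 in [-5,15]. It assumes that mypoly evaluates a polynomial whose coefficients are given starting from the constant term (as the usage comment suggests), uses the exact golden ratio constant instead of 0.618, and returns the midpoint of the final interval; the names tol, r and xmin are introduced here for illustration.

```matlab
% Hypothetical stand-in for mypoly: y holds the coefficients starting
% from the constant term, so y = [0;0;1] represents x^2
mypoly = @(y, x) polyval(flipud(y(:)), x);

y   = [0; 0; 1];          % f(x) = x^2
xl  = -5;  xu = 15;       % initial interval of uncertainty
tol = 0.01;               % stopping tolerance on the interval length
r   = (sqrt(5) - 1)/2;    % golden section constant, about 0.618

xp = xu - r*(xu - xl);
xq = xl + r*(xu - xl);
while (xu - xl) > tol
    if mypoly(y, xp) <= mypoly(y, xq)
        xu = xq;  xq = xp;  xp = xu - r*(xu - xl);   % keep [xl, xq]
    else
        xl = xp;  xp = xq;  xq = xl + r*(xu - xl);   % keep [xp, xu]
    end
end
xmin = (xl + xu)/2;       % midpoint of the final interval
fprintf('approximate minimizer: %g\n', xmin);
```

Each pass shrinks the interval by the factor r, so the number of iterations depends only on the ratio of the initial length to tol.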
19.7 Steepest Descent Method

% x_k+1 = x_k + alpha_k*d_k, d_k=-grad(f(x_k))/norm(grad(f(x_k)));
% alpha_k minimizes f(x_k+alpha_k*d_k) over alpha_k >= 0
function x = steepest_descent(Q, b, c, eps)
% steepest_descent - This function is used to minimize
% a quadratic function of n-variables using Steepest Descent method
% The objective function is
% Min 0.5*x'Qx - b'x + c (with Q positive semi-definite)
% for x1^2+x1x2+x2^2+x1+x2: Q=[2 1;1 2];
% Inputs:
% Q, b : Coefficients in the objective function
% eps : Epsilon parameter (For stopping rule)
% Output:
% x : Optimal Solution
% Q is the Hessian of f
% (here Q corresponds to Hessian of the quadratic function)
n = length(b);
x = zeros(n,1); % x0=[0 0 ... 0]'
gradf = Q*x - b; % g0
d = - gradf / norm(gradf); % d_0
k = 1;
while (norm(gradf) > eps)
    % Calculating h(alpha)=C(1)*alpha_k^2 + C(2)*alpha_k + C(3)
    % C is vector of coefficients of polynomial
    C = [0.5*(d'*Q*d) (0.5*d'*Q*x + 0.5*x'*Q*d - b'*d) ...
         (0.5*x'*Q*x - b'*x + c)];
    % for a general function the golden section rule can be used to
    % calculate the min, whereas for the quadratic case the minimizer
    % of h(alpha) is available directly
    alpha = - C(2)/(2*C(1));
    % x_k+1 = x_k + alpha_k*d_k
    x = x + alpha*d;
    k = k+1;
    % Calculating g_k+1
    gradf = Q*x - b;
    % d_k+1 = -g_k+1 / ||g_k+1||
    d = - gradf / norm(gradf);
end
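As a quick check of the steepest descent iteration, the script below applies it to the example x1^2+x1x2+x2^2+x1+x2, i.e. Q=[2 1;1 2] and, under the convention Min 0.5*x'Qx - b'x, b=[-1;-1]. It is a minimal sketch: the step length is computed from the equivalent closed form alpha = -(g'd)/(d'Qd), and the iteration cap of 1000 is an added safeguard that is not part of the listing.

```matlab
Q   = [2 1; 1 2];      % Hessian of the example quadratic
b   = [-1; -1];        % so that 0.5*x'*Q*x - b'*x = x1^2+x1x2+x2^2+x1+x2
tol = 1e-6;            % stopping tolerance on the gradient norm

x = zeros(2, 1);       % starting point x0 = 0
g = Q*x - b;           % gradient at x0
k = 0;
while norm(g) > tol && k < 1000
    d     = -g/norm(g);             % normalized steepest descent direction
    alpha = -(g'*d)/(d'*Q*d);       % exact minimizer of h(alpha)
    x     = x + alpha*d;
    g     = Q*x - b;
    k     = k + 1;
end
fprintf('x = (%g, %g) after %d iteration(s)\n', x(1), x(2), k);
```

For this particular data the starting gradient (1,1)' is an eigenvector of Q, so a single exact step already reaches the minimizer (-1/3,-1/3)'; for other starting points the iterates zigzag towards it.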













19.9 DFP Method 
function x = dfp(Q,b)
% dfp - This function is used to minimize a positive definite
% Quadratic form of n-variables using Davidon Fletcher
% Powell method.
% The objective function is
% Min 0.5*x'Qx - b'x, with Q positive definite
% Inputs:
% Q, b : Coefficients in the objective function
% Output:
% x : Optimal solution.
n = length(b);
x = zeros(n,1);
g = Q*x - b; % g0
S = eye(n); % S_0 = I
d = -S*g; % d_0 = -S_0*g_0; g_k=grad(f(x_k));
k = 0;
while ((k < n) & (norm(g) ~= 0))
    % Calculating h(alpha)=C(1)*alpha_k^2 + C(2)*alpha_k + C(3)
    C = [0.5*(d'*Q*d) (0.5*d'*Q*x + 0.5*x'*Q*d - b'*d) ...
         (0.5*x'*Q*x - b'*x)];
    % alpha = golden_section(0,10000,0.001,C);
    alpha = - C(2)/(2*C(1));
    % x_k+1 = x_k + alpha_k*d_k
    x = x + alpha*d;
    % p_k = alpha_k*d_k=(x_k+1-x_k);
    p = alpha*d;
    % q_k = g_k+1 - g_k
    q = (Q*x - b) - g;
    % Ak = (p_k*p_k') / (p_k'*q_k)
    A = p*p' / (p'*q);
    % Bk = - (S_k*q_k*q_k'*S_k) / (q_k'*S_k*q_k)
    B = - (S*q*q'*S) / (q'*S*q);
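The complete DFP iteration, including the update S_k+1 = S_k + Ak + Bk described by the comments above, can be tried on a small problem. The script below is a sketch under stated assumptions: Q is the matrix of the earlier example, b=[1;2] is chosen only for illustration (so that the minimizer is (0,1)'), the step length uses the closed form -(g'd)/(d'Qd), and gnew is a hypothetical helper name.

```matlab
Q = [2 1; 1 2];        % positive definite matrix
b = [1; 2];            % illustrative right-hand side; minimizer is Q\b = [0; 1]
n = length(b);

x = zeros(n, 1);
g = Q*x - b;           % g_0
S = eye(n);            % S_0 = I
k = 0;
while k < n && norm(g) > 1e-12
    d     = -S*g;                        % d_k = -S_k*g_k
    alpha = -(g'*d)/(d'*Q*d);            % exact line search for a quadratic
    p     = alpha*d;                     % p_k = x_k+1 - x_k
    x     = x + p;
    gnew  = Q*x - b;
    q     = gnew - g;                    % q_k = g_k+1 - g_k
    % DFP update: S_k+1 = S_k + Ak + Bk
    S     = S + (p*p')/(p'*q) - (S*q*q'*S)/(q'*S*q);
    g     = gnew;
    k     = k + 1;
end
fprintf('x = (%g, %g) after %d iterations\n', x(1), x(2), k);
```

With exact line searches on a positive definite quadratic the method stops after at most n iterations, and after n full updates S coincides with the inverse of Q up to rounding.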
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19.10 Frank and Wolfe’s Method 


% Determines an optimal solution of linearly constrained NLP 


% via Frank and Wolfe's method
% The function frank_wolfe provides an optimal solution of
% the problem.
%
% Min f(x)
% S.t. Ax=y, x>=0
%Inputs: (for f(x)=x^T Qx+b^Tx)
%A, y, Q, b, x0 (initial point)
%Coefficients in the objective function are in the decreasing
%order for eg.
%To minimize x1^2+x1x2+x2^2+x1+x2
% Subject to x_1+x_2+2x_3=3; x_1,x_2,x_3 >= 0 we have
%Q=[1 0.5 0.5;0.5 1 0;0.5 0 0.5];
%b=[-4;-3;-2];
%A=[1 1 2];
%y=[3];
%x0=[1;0;1];
%output: x : Optimal solution
function x0 = frank_wolfe(Q,b,A,y,x0)
k=0;
[m,n]=size(A);
type_obj=input(' 1 for min 0 for max =');
type_var=2;%type of variables
%type_var is 2 for >=0 variables
%(<= or >= or unrestricted (1/2/3)).
type_cons=3;%type of constraints
%type = vector of 1 or 2 or 3,i.e, <= or >= or = (1/2/3)
M=[type_var*ones(m,1)];
N=[type_cons*ones(m,1)];
n = length(b);


19.11 General Comments 


1. The code for the golden section rule has been written for the polynomial case only. 
Also the codes for the steepest descent method, the conjugate gradient method, 
and Frank and Wolfe’s method have been written for the quadratic case only. These 
codes can certainly be modified for the general nonlinear case (subject to appropriate 
convexity assumptions) provided the functions and their gradient/Hessian are input 
appropriately by employing symbolic functions of MATLAB. The interested readers 
are encouraged to use the “help” facility of MATLAB in this regard. 

2. Purely from the user’s point of view MATLAB has a very useful toolbox, namely 
Optimization Toolbox, which contains standard programs, e.g. linprog (for linear 
programming), quadprog (for quadratic programming) etc. 

3. The readers are encouraged to write their own codes by combining some of the 
codes given here. For example, one can write the code for Gomory’s cutting plane 
method by suitably combining the codes for the simplex method and the dual simplex 
method. 

4. An appropriate source for learning MATLAB related to optimization is the text 
Applied Optimization with MATLAB programming (P. Venkataraman, 2001, Wiley).
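As an illustration of the first comment, the steepest descent idea carries over to a general smooth convex function once the function and its gradient are supplied as function handles. The script below is only a sketch of one possible modification: the test function, its gradient and the starting point are made up for illustration, and the exact line search of the quadratic case is replaced by a simple backtracking (Armijo) rule rather than the golden section rule.

```matlab
% Illustrative smooth convex function and its gradient (assumptions)
f     = @(x) exp(x(1)) - x(1) + x(2)^2;     % minimized at (0, 0)
gradf = @(x) [exp(x(1)) - 1; 2*x(2)];

x   = [1; 1];          % starting point
tol = 1e-6;            % stopping tolerance on the gradient norm
k   = 0;
g   = gradf(x);
while norm(g) > tol && k < 1000
    d = -g;                                  % steepest descent direction
    t = 1;                                   % backtracking line search
    while f(x + t*d) > f(x) + 1e-4*t*(g'*d)
        t = t/2;                             % halve until sufficient decrease
    end
    x = x + t*d;
    g = gradf(x);
    k = k + 1;
end
fprintf('x = (%g, %g) after %d iterations\n', x(1), x(2), k);
```

Only the two handles f and gradf need to be changed to try another function; convergence to a global minimizer still relies on the convexity assumptions mentioned above.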
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simulated annealing, 538 
slack variables, 16 
soft margin classifier, 568 
solution of the game, 135 
stigmergy, 547 
strategy space, 133 
strict local max point, 274 
strict local min point, 274 
strictly concave function, 257 
strictly convex function, 257 
strictly feasible point, 421 
strictly feasible SDPP, 530 
strictly pseudoconcave function, 458 
strictly pseudoconvex function, 458 
strictly quasiconcave function, 452 
strictly quasiconvex function, 452 
strong duality for SDPP, 528 
strong duality in semi-definite programming, 531 
sufficient optimality conditions, 460 
support vector, 565 
support vector machine, 565 
supporting hyperplane, 62 
surplus variables, 16 
symbolic matrix, 157 
symmetric duality, 103 
synthesis antennae array problem, 616 


test data, 562 





trace function, 515 

tradeoff, 480 

training data, 562 

transhipment problem, 209 

travelling salesperson problem, 210 
twin support vector machine, 577 

two fund theorem, 595 

two person zero-sum matrix game, 133 


two phase method, 34 


unbalanced transportation problem, 183 
unbounded solution, 14 

unimodal, 327 

unimodular matrix, 156 

usable direction, 301 

usable feasible direction, 301 

utility function, 444 


value of the game, 134 
VLSI design, 609 


weak duality for semi-definite programming, 528 

weak duality theorem, 108 

weak efficient solution, 482 

weighted sum approach, 489 

wire-sizing problem, 614 

wireless network problem, 624 

Wolfe dual, 316 

Wolfe’s method, 285 
Numerical Optimization with Applications provides a focused and 
detailed study of various numerical optimization methods and their 
applications in Science, Engineering and Management. Apart from 
discussing standard optimization methods and their traditional applications, 
the book includes some very recent topics like Semi-definite Programming, 
Second Order Cone Programming, Evolutionary Methods and Global 
Optimization. An attempt has been made to present some modern and 
non-conventional applications of numerical optimization in the areas of Machine 
Learning, VLSI Design/Electrical Circuits and Financial Mathematics. A 
distinctive feature of the book is also to provide basic MATLAB codes as 
building blocks for readers to develop their own codes for various algorithms 
discussed in the book. 
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